ABSTRACT


Empirical Orthogonal Function (EOF) analysis is used to
represent the environmental wind forcing of selected western
North Pacific tropical cyclone tracks from 1979-1983. The
EOF analysis is applied separately to the zonal and
meridional wind components at 700, 400 and 250 mb on a
527-point grid with 277.8 km zonal and meridional spacing
that is relocated with the storm center. The 527 EOF coefficients
(for  each  level  and  component)  are  computed  for  a  sample  of 
682  cases.  The  coefficient  vectors  are  truncated  to  the 
first 35 coefficients based on a Monte Carlo selection
criterion. These coefficients account for at least 82
percent of the variance in each field. The EOF coefficients,
along with storm movement during the past 24 hours,
position,  date  and  intensity,  are  then  used  as  potential 
predictors  in  a  regression  analysis  forecast  scheme  for 
tropical cyclone motion. The EOF-based regression equations
are  tested  on  the  dependent  data  cases.  The  mean  72-hour 
track  forecast  error  is  between  450  and  500  km.  Therefore, 
it  appears  that  this  regression  scheme  has  potential  for 
operational  applications. 
I.  INTRODUCTION


A.  BACKGROUND 

This study addresses one of the most difficult problems
in tropical meteorology: forecasting the movement of
tropical cyclones. In discussing the impact of weather on
naval forces, materials and operations, Wells (1982)
emphasized the role of tropical cyclones. Avoidance of tropical
cyclones is important to both military and civilian
populations. Fleet operating orders contain lengthy, explicit
guidance on tropical cyclone evasion. Yet serious losses
due to tropical cyclones continue to occur. Because of the
potential devastation of life and property, continued
improvement in the ability to forecast tropical cyclone
movement is imperative. The guidance to avoid storm damage
is available, but precautions must be taken early. This
requires accurate tropical cyclone forecast methodology.

After George and Gray (1976), tropical cyclone movement
prediction models can be classified into four categories:
(1) steering flow; (2) statistical; (3) numerical; and
(4) climatology-persistence. The steering concept treats
tropical cyclones as vortices embedded in the basic
environmental flow. The statistical forecast approach commonly
uses a screening procedure to select meteorological
variables that are correlated with tropical cyclone movement.
These variables are then used to develop regression
equations for prediction. An analog-statistical model is based
upon the assumption that historical families of repetitive
storm tracks are associated with repetitive synoptic
patterns. By scanning historical data records, a computer
algorithm associates an existing storm with a
"parent" storm track or with a family of similar storms.
The numerical method involves predictions of the synoptic
flow surrounding a cyclone, and possibly a simulation of
cyclone structure, to predict storm movement. Prediction of
tropical cyclone movement from climatology and/or
persistence relies on empirical relationships derived
from historical records of the tracks of previous cyclones.
Objective methods for forecasting tropical cyclone movement
have been developed using one or more of these prediction
models. As yet, no one of these objective techniques has
been found to be superior to the others under all conditions
(e.g., Neumann and Pelissier, 1981).

The simplest numerical method of predicting tropical
cyclone movement is to use a barotropic model on a
relatively coarse grid (Sanders and Burpee, 1968) with a point
vortex advection scheme (Renard, 1963). Results obtained
from these methods demonstrated that there is considerable
information in the analyzed and predicted synoptic fields
represented on grids which lack the fine resolution
necessary to resolve the intense wind field near the center of a
tropical cyclone. However, Ley and Elsberry (1976) cited
these models as inadequate due to the lack of a unique
steering level (or layer) and the absence of
vortex-environmental interaction. Still, the relative success of
coarse-mesh models supports the idea that it might be
possible to relate large-scale forcing (by advective
processes) of a tropical cyclone to its subsequent movement.

Current statistical models for the prediction of tropical
cyclone movement use predictors derived from climatology,
persistence and either observed or numerically
forecast geopotential height data (such as gradients,
thicknesses and time changes). For example, Neumann and
Randrianarison (1976) developed a purely statistical model
based on a system of regression equations for the prediction
of tropical cyclone movement over the Southwest Indian
Ocean. Basically, the model is CLImatology plus PERsistence
(CLIPER), applied to the Indian Ocean. Stepwise regression
was used to develop second-order polynomials (35 variables)
to predict the zonal and meridional cyclone displacements.
The resultant model's performance compared favorably with
operational models (Joint Typhoon Warning Center, 1983). A
significant number of North Atlantic tropical cyclones
exhibit anomalous motion characteristics (Neumann, 1981).
The forecast tracks of these storms revealed limitations of
purely statistical forecast systems (Neumann and Lawrence,
1975). While some researchers sought to develop purely
dynamical models (e.g., Miller et al., 1972), others
developed statistical-dynamical models. The current NHC
statistical-dynamical model, NHC73, was described by Neumann
and Lawrence (1975). The results demonstrated that information
obtained from numerical prognoses can improve the
performance of statistical tropical cyclone prediction
models.

Statistical  models  for  the  prediction  of  tropical 
cyclone  movement  have  traditionally  used  a  coordinate  system 
oriented  with  respect  to  the  zonal  and  meridional  axes. 
Tropical  cyclones  tend  to  move  with  the  synoptic  flow. 
Short-term  displacements  have  a  very  strong  persistence 
component.  For  these  reasons,  a  grid  system  oriented  with 
respect  to  the  cyclone's  heading  would  be  a  natural  choice. 
Shapiro  and  Neumann  (1984)  investigated  the  error-reducing 
potential  of  a  grid  system  oriented  with  respect  to  the 
cyclone  heading.  This  grid-reorientation  technique  resulted 
in  a  40  percent  reduction  of  the  total  variance  of  tropical 
cyclone  movement.  It  was  shown,  using  the  dependent  data 
sample,  that  a  potential  reduction  of  24-hour  forecast 
errors  by  approximately  13  percent  could  be  realized  for 
synoptic predictors extracted on a rotated grid. This
reduction in error was comparable to the reduction in
24-hour forecast errors during the past 25 years (Shapiro
and Neumann, 1984). It was observed that the entire
reduction of forecast error is not realizable due to random and
real errors in the developmental and operational height
data, respectively. Satisfactory results were not obtained
using rotated grids for prediction of 48- and 72-hour
tropical cyclone movements. An analysis of forecast results
revealed that grid rotation optimized forecasts in the
direction along which the variance of tropical cyclone
movement is maximized and tended to orient the displacement
vectors with the along-track direction. The results of
Shapiro and Neumann (1984) indicated the potential forecast
improvement that can be made in short-term forecasts with
current synoptic data if the cyclone's heading is known.
However, these concepts must still be tested in an
operational environment. For this reason, the data grid used in
this  study  was  geographically  oriented.  This  grid  system 
will  be  described  in  the  next  chapter. 

Both  statistical  and  dynamical  methods  have  weaknesses 
(Haltiner  and  Williams,  1980;  Shaffer,  1982).  Statistical 
methods  usually  do  not  forecast  well  those  cyclones  that 
have  anomalous  motions.  This  problem  relates  to  the  "scope" 
of  a  model,  as  discussed  in  Chapter  V.  Similarly,  these 
methods  are  typically  not  robust  against  small  changes  in 
the  synoptic  (dynamic)  forcing  of  a  cyclone.  There  is  a 
general  tendency  of  statistical  methods  toward  homogenized, 
or  smoothed,  forecasts.  In  comparison,  dynamical  models 
suffer  from  both  theoretical  and  financial  limitations.  Due 
to  the  smallness  of  the  Coriolis  parameter  in  tropical 
regions,  geostrophy  cannot  be  assumed  and  initialization  of 
data  fields  is  difficult.  Erroneous  data  used  to  initialize 
a  model  can  rapidly  deteriorate  a  numerical  forecast. 
Convective  heating  is  one  of  the  primary  driving  mechanisms 
for the maintenance of a tropical cyclone. The difficulty
of modeling convective heating, together with initialization
problems, makes dynamical model predictions suspect in the
tropics. More importantly, maintenance of the energy
balance for a tropical cyclone requires an interaction among
different scales of motion (Ooyama, 1982). To avoid
spurious solutions, a small grid mesh is necessary for a
dynamical model to numerically simulate these interactions.
Furthermore, the expense of numerical integration on a fine
mesh can be quite large due to the Courant-Friedrichs-Lewy
(CFL) condition, which requires integration to be made with
smaller time steps as the mesh size is decreased (Haltiner and
Williams, 1980). An additional difficulty encountered with
a  fine-mesh  model  is  that  observed  data  in  tropical  regions 
are  inadequate  for  model  initialization. 
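
The cost implication of the CFL condition can be illustrated with a short sketch. It assumes the common one-dimensional form c*dt/dx <= 1 for an explicit scheme. The function names, the Courant limit of one and the cubic cost scaling for a two-dimensional horizontal grid are illustrative assumptions and are not taken from Haltiner and Williams (1980) or from this study.

```python
def max_stable_time_step(grid_spacing_m: float, wave_speed_m_s: float,
                         courant_limit: float = 1.0) -> float:
    """Largest time step (s) satisfying c * dt / dx <= courant_limit."""
    if grid_spacing_m <= 0 or wave_speed_m_s <= 0:
        raise ValueError("grid spacing and wave speed must be positive")
    return courant_limit * grid_spacing_m / wave_speed_m_s


def relative_cost(coarse_spacing_m: float, fine_spacing_m: float) -> float:
    """Rough cost ratio (fine / coarse) for a fixed domain and forecast length.

    Assumes a two-dimensional horizontal grid: the number of points grows
    as (1/dx)**2 and the number of time steps as 1/dx, giving a cubic law.
    """
    ratio = coarse_spacing_m / fine_spacing_m
    return ratio ** 3
```

Under these assumptions, halving the mesh size (for example, relative_cost(200e3, 100e3)) multiplies the work by a factor of eight, which is the sense in which a fine mesh becomes expensive.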

Neumann and Pelissier (1981) studied the performance
characteristics of various tropical cyclone movement
prediction models in operational use at the National
Hurricane Center (NHC) in Miami, Florida. These models are
representative of the current methodology for prediction of
tropical cyclone movement. The seven models range in
complexity from the basic analog to the sophisticated
numerical and are identified in Table I as statistical,
statistical-synoptic, statistical-dynamical or dynamical.
Four of the statistical schemes are regression-equation
models. Predictors for these equations are generally
derived from climatology, persistence and geopotential
height data (except CLIPER). A fifth statistical model
(HURRAN) uses an analog approach. Operational analysis of
CLIPER and HURRAN (Hope and Neumann, 1970) showed that
these models give almost identical forecast tracks.

For forecasting western North Pacific tropical cyclones,
five main categories of objective techniques are used by the
Joint Typhoon Warning Center (JTWC), Guam, Marianas Islands:


where a(i,j) is the corresponding element of matrix A, and
b(i) and s(i) are respectively the mean and standard
deviation of the elements in row i of matrix A (that is, the mean
and standard deviation at a particular point of the
equidistant grid computed over all cases). The elements of
Z are dimensionless variates of zero mean and standard
deviation one. The main advantage of using standardized data is
that it effectively treats the systematic variation in
magnitude of the elements of the data matrix A. This is
beneficial for the reasons given in Chapter II. The same
systematic error can occur with the use of the covariance
matrix, which also introduces the need for dimensional
scaling to return to the form of the input data prior to
interpretation of the eigenvectors. A disadvantage of using
the correlation matrix is potential, but slight, smoothing
of the results (Kutzbach, 1967).
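
The row-wise standardization in (3.1) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name and the sample numbers are hypothetical, and the population standard deviation (division by n) is assumed so that the diagonal of ZZ'/n equals one.

```python
from statistics import fmean, pstdev


def standardize_rows(a):
    """Return Z with z(i,j) = [a(i,j) - b(i)] / s(i) for each row i of A.

    Each row of A holds one grid point across all cases; b(i) and s(i)
    are that row's mean and (population) standard deviation.
    """
    z = []
    for row in a:
        b = fmean(row)
        s = pstdev(row)
        if s == 0.0:
            raise ValueError("a row with zero variance cannot be standardized")
        z.append([(value - b) / s for value in row])
    return z


# Hypothetical data: two grid points, three cases, very different magnitudes.
A = [[1.0, 2.0, 3.0],
     [10.0, 30.0, 50.0]]
Z = standardize_rows(A)
```

In this example both rows of Z come out the same even though the second row of A varies twenty times more strongly, which is the sense in which standardization keeps high-variability grid points from dominating.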

The  correlation  matrix  (R)  then  is  the  symmetric  matrix: 


R = ZZ'/n ,    (3.2)


where  n  is  the  number  of  cases,  and  a  prime  is  used  to 
denote  the  transpose  of  a  matrix  or  vector.  Next,  it  is 
necessary  to  determine  the  following  constrained  maximum: 


Max { y : e'e = 1 }  where  y = e'Re ,    (3.3)


for  the  m  dimensional  column  vector  e.  The  scalar  y  is  the 
correlation  between  vector  e  and  the  data  matrix  A.  The 
constraint  requires  that  the  vector  e  be  normalized  to 
length  one.  Morrison  (1967)  applies  the  method  of  Lagrange 
multipliers  to  (3.3)  to  obtain: 


( R - vI ) e = 0 .    (3.4)
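
The relations (3.2) through (3.4) can be illustrated numerically. The sketch below forms the correlation matrix from a small, hypothetical standardized matrix Z and approximates the leading eigenvector by power iteration, which is a swapped-in technique chosen for brevity and not the eigenvalue routine used in this study. The function names and sample data are assumptions.

```python
import math


def correlation_matrix(z):
    """Return R = Z Z' / n for an m x n standardized matrix Z (rows are grid points)."""
    n = len(z[0])
    return [[sum(zi[k] * zj[k] for k in range(n)) / n for zj in z] for zi in z]


def leading_eigenpair(r, iterations=500):
    """Approximate the largest eigenvalue of R and its unit-length eigenvector."""
    m = len(r)
    # Unequal starting components make it unlikely (though not impossible)
    # that the start is orthogonal to the leading eigenvector.
    e = [float(i + 1) for i in range(m)]
    for _ in range(iterations):
        w = [sum(r[i][j] * e[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        e = [x / norm for x in w]  # enforce the constraint e'e = 1
    re = [sum(r[i][j] * e[j] for j in range(m)) for i in range(m)]
    y = sum(e[i] * re[i] for i in range(m))  # y = e'Re
    return y, e


# Hypothetical standardized data: 3 grid points, 4 cases.
Z = [[1.0, -1.0, 1.0, -1.0],
     [1.0, -1.0, 1.0, -1.0],
     [1.0, 1.0, -1.0, -1.0]]
R = correlation_matrix(Z)
y, e = leading_eigenpair(R)
# Residual of the eigenvalue equation, with the multiplier set equal to y.
residual = [sum(R[i][j] * e[j] for j in range(3)) - y * e[i] for i in range(3)]
```

For this sample the first two grid points are perfectly correlated and the third is uncorrelated with them, so the leading mode weights the first two points equally and carries two of the three units of total standardized variance.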
Pacific wind vectors. The complex EOF results linked the
spatial  and  temporal  patterns  of  the  data  fields.  This 
fusion  of  space-time  variations  is  particularly  useful  for 
long-term  records  over  large  spatial  areas.  The  temporal 
variance  of  the  data  was  partitioned  into  orthogonal  spatial 
patterns  (the  eigenvectors).  The  complex  coefficients 
computed  were  shown  to  be  a  time  series  modulating  the 
eigenvectors  which  were  associated  with  physical  patterns 
(signals)  that  accounted  for  a  large  percentage  of  the  total 
variance.  Legler  further  demonstrated  that  it  is  possible 
to  obtain  statistical  information  that  could  not  be  obtained 
using  a  scalar  analysis  of  the  wind  components. 
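
The idea of a vector (complex) EOF analysis can be sketched in a few lines. The example below treats each wind observation as the complex number w = u + iv, forms a Hermitian cross-product matrix and extracts its leading mode by power iteration. It is only an illustration under stated assumptions: the series are taken to be anomalies already, the function names and data are hypothetical, and the procedure is not claimed to reproduce the method of Hardy and Walton (1978) or Legler (1983) in detail.

```python
import math


def to_complex(u, v):
    """Combine zonal (u) and meridional (v) series into w = u + iv, station by station."""
    return [[complex(a, b) for a, b in zip(u_row, v_row)]
            for u_row, v_row in zip(u, v)]


def hermitian_covariance(w):
    """Return C = W W^H / n for an m x n complex matrix W (rows are stations)."""
    n = len(w[0])
    return [[sum(wi[k] * wj[k].conjugate() for k in range(n)) / n for wj in w]
            for wi in w]


def leading_mode(c, iterations=500):
    """Approximate the largest eigenvalue of C and its unit-length complex eigenvector."""
    m = len(c)
    e = [complex(i + 1, 0.0) for i in range(m)]
    for _ in range(iterations):
        p = [sum(c[i][j] * e[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in p))
        e = [x / norm for x in p]
    ce = [sum(c[i][j] * e[j] for j in range(m)) for i in range(m)]
    eigenvalue = sum(e[i].conjugate() * ce[i] for i in range(m)).real
    return eigenvalue, e


# Hypothetical anomalies: station 2 has the same wind as station 1,
# rotated 90 degrees counterclockwise (purely zonal becomes purely meridional).
u = [[1.0, -1.0, 2.0, -2.0], [0.0, 0.0, 0.0, 0.0]]
v = [[0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 2.0, -2.0]]
C = hermitian_covariance(to_complex(u, v))
eigenvalue, mode = leading_mode(C)
explained = eigenvalue / sum(C[i][i].real for i in range(len(C)))
```

A single complex mode accounts for all of the variance here, and the phase difference between its two components encodes the rotation between the stations. Two separate scalar analyses of u and v would not express that relationship in one pattern.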

Whether  to  perform  a  scalar  or  a  vector  EOF  analysis  is 
a  fundamental  consideration.  Kjelass  (1971)  has  edited 
several  articles  on  the  theory  and  methodology  of  scalar  and 
vector  EOF  analysis.  These  articles  include  illustrative 
examples  of  the  application  of  the  methodology  in 
geophysics. For vector data, a more realistic
representation would be expected from a vector EOF analysis, as
demonstrated by Legler (1983). However, it would be rash to
assume that a vector analysis is necessarily best for vector
data. As discussed in Chapter IV, the meridional and zonal
wind components comprising the data for this study were
subjected to a scalar EOF analysis. The mathematical
procedure for this analysis is described in the next section.

B.  THE  EOF  METHOD 

Let  A  be  an  m  x  n  matrix  containing  n  cases  of  m-variate 
data.  The  following  development  will  be  for  the  scalar  EOF 
analysis  of  the  standardized  data  matrix  Z  with  elements 
z(i,j) defined by:

z(i,j) = [ a(i,j) - b(i) ] / s(i) ,    (3.1)
meteorological forcing patterns. Because of their inherent
empirical nature, it is not required, and hence not always
found  to  be  the  case,  that  the  eigenvectors  have  a  physical 
interpretation  that  accounts  for  any  variation  of  the  field 
being  analyzed. 
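
The data-reduction aspect of an EOF analysis can be made concrete by accumulating the eigenvalues in rank order. The sketch below uses a fixed variance threshold to choose the number of retained modes. That rule is a simple stand-in chosen for illustration and is not the Monte Carlo selection criterion used in this study; the function names and eigenvalues are hypothetical.

```python
def cumulative_variance_fraction(eigenvalues):
    """Fraction of total variance explained by the first 1, 2, ... ranked modes."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    fractions = []
    running = 0.0
    for value in ordered:
        running += value
        fractions.append(running / total)
    return fractions


def modes_needed(eigenvalues, target_fraction):
    """Smallest number of leading modes whose cumulative fraction reaches the target."""
    fractions = cumulative_variance_fraction(eigenvalues)
    for count, fraction in enumerate(fractions, start=1):
        if fraction >= target_fraction:
            return count
    return len(fractions)


# Hypothetical eigenvalues of a 4 x 4 correlation matrix (they sum to 8 only
# for arithmetic convenience; a true correlation matrix of order 4 sums to 4).
sample = [0.5, 4.0, 1.0, 2.5]
fractions = cumulative_variance_fraction(sample)
```

With these sample values the first two modes explain 81.25 percent of the variance and three modes are needed to reach 90 percent.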


The EOF methodology has been applied to wind data in
various ways. For example, Barnett (1977)
applied an EOF analysis to Pacific trade wind data separated
into zonal and meridional components. The result was an
analysis of two separate scalar fields. Alternatively, the
treatment of the wind as complex numbers for the EOF
technique was presented by Hardy (1977). Hardy and Walton
(1978) analyzed mesoscale wind vector measurements at ten
stations in the San Francisco Bay Area. This report
included a useful mathematical appendix describing the
analysis of complex (that is, vector) data, since EOF analysis
of two-dimensional vector data is achieved by use of complex
rather than real numbers. This extension of the methodology
is straightforward. The time series analysis of the
temporal component patterns was also illustrated. Results
of that study confirmed that EOF analysis can be
advantageously applied to large sets of regional wind velocity
data. The method objectively derived the essential spatial
and temporal properties represented by the data, and enabled
a quantitative development of "prototype" cases and a
quantitative comparison of regional velocity patterns on a
month-to-month basis. This is similar to the application of
EOF analysis for map typing. For example, Brown (1981) used
EOF methods to divide height fields surrounding tropical
cyclones into smaller classes based on the derived
coefficients. These classes were used for an analog scheme to
forecast tropical storm movement.

Legler  (1983)  applied  the  method  of  Hardy  and  Walton 
(1978)  to  18  years  of  monthly-average  records  of  tropical 
expansion coefficients are a time series representation of
these temporal patterns (Hardy and Walton, 1978; Legler,
1983).

Examples  of  the  use  of  eigenvectors  (eigenmodes)  in 
meteorological  applications  include  those  of  Lorenz  (1956) 
in statistical weather prediction, Grimmer (1963) in an
analysis  of  temperature  patterns  in  Europe,  and  Mateer 
(1965)  in  an  analysis  of  observations  of  ozone  distribution 
from  sky-light  intensities.  Many  other  studies  can  be  found 
in  the  meteorological  literature.  Hardy  and  Walton  (1978) 
gave a broad survey of possible applications of the analysis
of  scalar  data.  The  mathematical  details  of  the  scalar  EOF 
method  are  described  in  the  next  section. 

There are particular advantages afforded by an EOF data
analysis. It is not necessary for the data to be stationary
(in a statistical sense), nor do they have to be uniformly
sampled in space or time. The EOF method is a convenient,
cost-effective and objective means to represent large
amounts of synoptic data by comparatively few coefficients.
While numerical storage is not normally a problem with
modern computers, it is important that the researcher be
able to represent synoptic fields in a "compact" manner.
Also, these coefficients can be readily incorporated into a
regression analysis. Kutzbach (1967) gives a particularly
clear description of an EOF analysis that was used to reduce
23 temperature observations at 25 grid points to five
eigenvectors which accounted for 88 percent of the total
variation. Similarly, Stidd (1967) performed an EOF study of the
average monthly rainfall in Nevada and was able to account
for 93 percent of the total variance using only three
eigenvectors and coefficients. The eigenvectors were
successfully associated with factors related to rainfall. These
examples demonstrate the effective use of EOF analysis for
data reduction and for possible identification of
III.  EMPIRICAL  ORTHOGONAL  FUNCTIONS


A.  BACKGROUND 

The general application of eigenvectors in an EOF
analysis is similar to the representation of a field in terms of
orthogonal functions. While conventional orthogonal functions
are generally simple analytic functions such as sines and
cosines, eigenvectors are derived from the data fields
themselves. After suitable
ranking, a few eigenvectors may represent a significantly
higher proportion of data variance than would the same
number of orthogonal functions. The statistical methods
known as principal component analysis and empirical
orthogonal function analysis (also referred to as empirical
eigenvector analysis) are in essence the same. The principal
components are the same coefficients that would be derived
from an EOF analysis.

The EOF analysis is an objective, mathematical procedure
which starts with either the correlation or covariance
matrix of the original data matrix. From the cross-product
matrix, the eigenvalues and eigenvectors are derived. The
normalized eigenvectors form a complete orthonormal basis of
vectors which can be used to represent the original
observations. It will be shown that the relative magnitudes of the
eigenvalues can be used to rank-order the eigenvectors
(modes) in terms of their significance in representing the
data. Furthermore, the most significant eigenvectors (that
is, those which represent the greatest percentage of
variability in the data) can often be identified with physically
important patterns in the original data. While not
important for this study, it is noted that data containing
recurrent temporal variations have spatial eigenvectors whose


tropical cyclone maintenance (Gray, 1979). The vertical
shear of the mean zonal wind near the cyclone center is not
large and changes sign across the center. The shear is
positive to the poleward side and negative to the
equatorward side of the cyclone. Also, the line of zero zonal
vertical shear crosses near the cyclone center.

These mean wind fields thus show that the GBA are
capable of representing the flow around tropical cyclones.
In the next chapter, the method of using EOFs to represent
this flow for all the cases in the sample will be described.


are shown in Figs. 1-6. The means and standard deviations
were based on all 682 cases. As shown in Figs. 1-6, the
variability of the winds is largest in the northeast
quadrant of the equidistant grid at all three levels. Because
the variability of wind speed is not uniform throughout the
grid, standardization of the winds by the mean and standard
deviation at each grid point is essential to ensure that
regions of the grid where variability is generally higher do
not "dominate" in an EOF analysis. The standardization of
data will be presented in Chapter III.

The  mean  zonal  and  meridional  flow  patterns  at  700  mb 
appear  to  be  physically  reasonable.  The  mean  zonal  flow  in 
Fig.  1  shows  easterlies  (westerlies)  to  the  north  (south)  of 
the storm center. Although the grid resolution does not
reveal the fine structure of the storm, the cyclonic
environment of the storm is evident. The mean meridional wind
component  in  Fig.  2  is  dominated  by  southerly  (northerly) 
flow  to  the  east  (west)  of  the  storm.  Again,  the  cyclonic 
shear  envelope  of  the  storm  can  be  clearly  identified.  The 
mean  zonal  wind  fields  (Figs.  1,  3  and  5)  show  significant 
strengthening  of  the  westerlies  north  of  the  storm  from 
700  mb  to  250  mb.  The  strong,  positive  meridional  flow 
northeast  of  the  storm  at  400  mb  and  250  mb  (Figs.  4  and  6) 
could  be  an  indication  of  a  possible  outflow  channel,  which 
has  been  shown  to  be  favorable  for  tropical  cyclone 
intensification  (Chen  and  Gray,  1984). 

The  low-level  cyclonic  and  upper-level  anticyclonic 
circulations in the mean wind fields are generally
representative of mostly mature cyclones. It is recognized that
computation  of  the  mean  fields  was  not  restricted  to  cases 
for  which  the  developing  cyclone  had  matured  to  tropical 
storm  intensity  or  to  cases  of  intensifying  cyclones. 
Nevertheless,  Figs.  1,  3  and  5  indicate  patterns  of  the 
vertical  shear  of  zonal  wind  that  have  been  associated  with 


intensity  (maximum  sustained  winds  of  18  m/s  (35  kts)  or 
greater)  must  have  been  present  west  of  the  dateline.  The 
JTWC  warning  position  at  the  times  the  GBA  were  produced 
must have been at a latitude less than 34.6 N to ensure that
data were available for a sufficient latitudinal extent
north  of  the  cyclone  center.  Finally,  the  GBA  must  be 
available  for  the  zonal  and  meridional  wind  components  at 
700  mb,  400  mb  and  250  mb. 

A total of 1357 cases were found to meet the above
criteria. Because of computation-time limitations
subsequently encountered, the initial data set was later reduced
to 682 cases by random selection. These 682 cases comprised
the data set from which the EOFs were computed.
However, not all 682 cases were suitable for the regression
analysis due to an inadequate history or future storm
record. The selection of cases for the regression analysis
will be described in Chapter V.

A relocatable 527-point grid was defined with a fixed
zonal and meridional separation of 277.8 km (150 n mi). The
grid in Fig. 1 is typical. There are 31 grid points west to
east and 17 south to north. The horizontal resolution is
twice that of Shaffer (1982), with about 4 1/3 times the
number of grid points (527 vice 120). The equidistant grid
extends 8334 km (4500 n mi) zonally and 4445 km (2400 n mi)
meridionally. The grid is moved for each case so that the
tropical cyclone center is always located at the
(0,0) grid point. For each case, the zonal and meridional
wind speeds at 700 mb, 400 mb and 250 mb were extracted
from the GBA onto the equidistant grid using a bilinear
interpolation method (on a spherical Earth). The warning
position from the JTWC was used to locate the cyclone
center.
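
The grid geometry and the interpolation step can be sketched as follows. The grid dimensions and spacing are those given above. Everything else is an assumption made for illustration: the index of the storm-center point within the grid is not stated in this section, so the defaults below are placeholders, and the sketch works with plane distance offsets and a unit-cell interpolation formula instead of the spherical-Earth calculation actually used.

```python
GRID_SPACING_KM = 277.8        # 150 n mi, from the text
N_ZONAL, N_MERIDIONAL = 31, 17  # 31 x 17 = 527 points, from the text


def grid_offsets_km(center_col=15, center_row=8):
    """(x, y) offsets in km of every grid point from the storm center.

    center_col and center_row are PLACEHOLDER indices for the (0,0) point;
    the actual position of the storm within the grid is an assumption here.
    Rows run south to north, columns west to east.
    """
    return [[((col - center_col) * GRID_SPACING_KM,
              (row - center_row) * GRID_SPACING_KM)
             for col in range(N_ZONAL)]
            for row in range(N_MERIDIONAL)]


def bilinear(f00, f10, f01, f11, tx, ty):
    """Bilinear interpolation inside one cell of a regular source grid.

    f00, f10 are the values at the lower-left and lower-right corners,
    f01, f11 at the upper-left and upper-right; tx and ty in [0, 1] are
    the fractional distances from the lower-left corner.
    """
    bottom = f00 + tx * (f10 - f00)
    top = f01 + tx * (f11 - f01)
    return bottom + ty * (top - bottom)


offsets = grid_offsets_km()
zonal_extent_km = offsets[0][-1][0] - offsets[0][0][0]
meridional_extent_km = offsets[-1][0][1] - offsets[0][0][1]
```

The computed extents, 30 and 16 grid intervals respectively, reproduce the 8334 km and roughly 4445 km dimensions quoted above regardless of where the center point is placed.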

Contours  of  the  mean  and  standard  deviation  fields  of 
the  zonal  and  meridional  winds  at  700  mb,  400  mb  and  250  mb 
II.  DATA  ACQUISITION  AND  FIELD  DEFINITION

Wind data used in the present study are from the Global
Band Analyses (GBA), which are operationally generated by
the United States Navy Fleet Numerical Oceanography Center
(FNOC). The GBA are produced on a 49 x 144 Mercator grid.
At 22.5 N or S, the grid mesh distance is 257 km. The GBA
provide complete longitudinal coverage over latitudes
40.956 S to 59.745 N. Grid points are always separated by
2.5 degrees of longitude. However, convergence of the
meridians causes the actual zonal distance separating grid
points to decrease toward higher latitudes. Along the
northern boundary of the GBA grid from 58.462 N to 59.745 N,
the longitudinal separation of the grid points undergoes a
3.7 percent decrease. This should not be an important
source of error given the inherent uncertainties of the raw
data. The GBA were available for the period 0000 GMT 5
January 1975 to 1200 GMT 31 December 1983. Data were
available at 0000 GMT and 1200 GMT for the zonal and meridional
wind components at the following levels: surface,
700 mb, 400 mb, 250 mb and 200 mb. It is noted that the GBA
are missing for some dates and times at one or more levels.
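
The grid distances quoted above follow from the convergence of the meridians and can be checked with a short calculation. The sketch assumes a spherical Earth with a mean radius of 6371 km, a value that is not given in the text, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; not stated in the text
DELTA_LON_DEG = 2.5       # fixed longitudinal spacing of the GBA grid


def zonal_spacing_km(latitude_deg):
    """Zonal distance between adjacent grid points at the given latitude."""
    return (EARTH_RADIUS_KM * math.radians(DELTA_LON_DEG)
            * math.cos(math.radians(latitude_deg)))


def percent_decrease(lat_from_deg, lat_to_deg):
    """Percentage by which the zonal spacing shrinks between two latitudes."""
    return 100.0 * (1.0 - zonal_spacing_km(lat_to_deg) / zonal_spacing_km(lat_from_deg))


spacing_at_22_5 = zonal_spacing_km(22.5)
northern_decrease = percent_decrease(58.462, 59.745)
```

Under the assumed radius, the spacing at 22.5 N rounds to 257 km and the decrease across the northernmost grid row rounds to 3.7 percent, in agreement with the figures in the text.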

Data  for  western  North  Pacific  tropical  cyclones  are 
available  from  the  annual  tropical  cyclone  reports  of  the 
JTWC. At six-hour intervals, warning position, best track
position and estimated intensity (maximum sustained wind speed
and minimum surface pressure) are given, as well as
forecasts for 24, 48 and 72 hours. The JTWC annual reports for
the  years  1979  to  1983  were  used  to  select  the  cases  used  in 
this  study.  To  apply  the  technique  proposed  in  Chapter  I, 
the  following  conditions  for  case  selection  were  imposed.  A 
tropical  cyclone  which  matured  to  at  least  tropical  storm 


are  determined  by  regression  equations  for  the  orthogonal 
components of motion. Chapter VI addresses the important
question of applicability of the model for independent data.
The concluding Chapter VII contains suggestions for further
research.


an EOF-based regression approach can provide a simple,
low-cost  technique  for  prediction  of  tropical  cyclone 
motion. 

The extent to which the surrounding flow can be used to
predict tropical cyclone movement has been explored by
studies such as Shaffer and Elsberry (1982) and is a key
motive for this study. The motion of a tropical cyclone is
not determined solely by forces acting on one pressure level
but rather by the mean wind flow integrated through a deep
layer and over a substantial area surrounding the cyclone
(Miller and Moore, 1960). Because a single steering level
has not been established, these regression studies involve a
single-level model that is tested with predictors extracted
on three different levels. The primary purpose of the
current study is to use analyzed wind fields to represent
synoptic forcing in a tropical cyclone movement forecast
technique. Both Shapiro and Neumann (1984) and Shaffer and
Elsberry (1982) worked with geopotential height data.
Because the wind fields are generally more representative of
the flow in the tropics, it is hypothesized that a study
similar to Shaffer's using wind data could result in further
improvement of forecast ability.

The  techniques  that  have  been  applied  are  not  new.  The 
uniqueness  of  the  new  forecast  scheme  is  the  use  of  an  EOF 
representation  of  the  wind  forcing  in  the  prediction  of 
tropical  cyclone  movement.  This  forecast  method  can  be 
described  as  a  statistical-climatological  tropical  cyclone 
forecast  method  which  uses  an  EOF  representation  of  the 
synoptic-scale  wind  forcing. 

Chapter  II  discusses  the  acquisition  of  data  and  the 
grid  system  used.  The  EOF  methodology  and  analysis  are 
described  in  Chapters  III  and  IV.  In  Chapter  V,  the 
resultant  equations  from  a  regression  analysis  are  used  to 
develop  a  prototype  forecast  scheme.  Future  storm  positions 


the  synoptic-scale  features  adjacent  to  the  tropical 
cyclone.  One  approach  is  to  consider  the  cyclone  to  be  a 
point  vortex  whose  direction  and  speed  are  approximated  by 
the direction and speed of the surrounding winds (or,
equivalently, the pressure or height gradients across the
cyclone). The steering level is that pressure level at
which  the  wind  speed  and  direction  best  correlate  with  those 
of  the  cyclone.  The  steering  level  theory  has  been  applied 
in  several  tropical  cyclone  movement  forecast  schemes;  for 
example, Riehl and Shafer (1944), Miller and Moore (1960),
Tse  (1966)  and  Renard  et  al.  (1973).  Different  steering 
levels  are  used  by  the  various  forecast  schemes.  However, 
the general consensus is that the mid-tropospheric levels
(700 mb and 500 mb) are the best for predicting tropical
cyclone movement (Chan and Gray, 1982). The upper
tropospheric level winds have not been found to be useful for
tropical cyclone movement prediction (Jordan, 1952; Miller,
1958).

Statistical  regression  equations  were  developed  by 
Shaffer  (1982)  to  predict  the  zonal  and  meridional  displace¬ 
ments  of  tropical  cyclones  at  12-hour  intervals  to  84  hours. 
EOF coefficients of the dependent sample were used to
represent the synoptic forcing in the equations. Forecast errors
were competitive with other statistical methods. The
average  vector  displacement  error  for  an  independent  sample 
was  approximately  17  percent  smaller  than  the  long-term 
average  official  JTWC  forecasts.  The  best  overall  forecasts 
were  obtained  using  equations  derived  with  500  mb  height 
data.  For  these  equations,  the  vector  displacement  forecast 
errors  obtained  for  the  independent  sample  were 
164  km  (88  n  mi),  333  km  (176  n  mi)  and  513  km  (277  n  mi) 
for  24-,  48-  and  72-hour  forecasts,  respectively.  It  is 
noted  that  a  shortcoming  of  Shaffer  (1982)  was  the  smallness 
of the independent sample, but the study demonstrated that
number of grid-point predictors would be prohibitive. The
difficulties  inherent  in  both  statistical  and  dynamical 
methods  motivated  Shaffer  (1982)  and  Shaffer  and  Elsberry 
(1982)  to  develop  a  statistical-climatological  tropical 
cyclone  track  prediction  technique  using  an  EOF  representa¬ 
tion  of  the  synoptic  forcing.  The  EOFs  provided  an  alterna¬ 
tive  to  grid-point  predictors.  The  technique  enabled  the 
representation  of  fields  of  120  grid  points  by  10  eigenvec¬ 
tors  and  their  associated  EOF  coefficients.  Eighty-five 
percent  of  the  total  variance  of  the  data  was  accounted  for 
by  these  10  modes.  Shapiro  and  Neumann  (1984)  also  used  10 
modes  to  account  for  98  percent  of  the  total  variance  in 
geopotential  height  data  in  either  a  rotated  or 
geographically-oriented  grid  system.  These  advantages  of 
data  reduction  and  simple  numerical  representation  of 
synoptic  fields  make  the  EOF  technique  ideal  to  use  with 
regression  analysis.  The  eigenvectors  represented  different 
patterns  relating  to  tropical  cyclone  movement;  that  is, 
patterns  which  appeared  to  be  physically  important  in  the 
determination  of  tropical  cyclone  movement.  This  approach 
was  novel  for  forecasting  of  storm  movement  in  the  sense 
that previous regression analysis methods (e.g., Neumann
and  Lawrence,  1973)  had  not  incorporated  the  entire  synoptic 
forcing  field. 

That  the  synoptic  flow  surrounding  a  tropical  cyclone  is 
a  major  determinant  of  cyclone  movement  has  been  long 
observed (Chan and Gray, 1982). In particular, it has been
well established that tropical cyclone movement is
significantly related to mid-tropospheric surrounding wind patterns
(Chan et al., 1980). Neumann and Lawrence (1975) associated
most  of  the  variance  reduction  by  statistical  models  for 
prediction  of  tropical  cyclone  movement  with  input  from 


(1) climatological and analog techniques; (2) extrapolation;
(3) steering techniques; (4) dynamic models; and (5) empirical
and analytical techniques. A brief description of the
objective techniques used is given in the annual report
(Joint Typhoon Warning Center, 1983). In contrast to the
NHC, the JTWC has not placed emphasis on the development of
statistical methods. The variety and range of sophistication
of techniques in operational use at the NHC and the
JTWC for objective forecasting of tropical cyclone movement
is noteworthy. That simple methods such as mere extrapolation
are competitive with complicated numerical models might
be taken as a surprising indication that little progress has
been made in the improvement of forecast skill.
Alternatively, the indication could be that tropical cyclones
are not predictable solely by use of a single class of
methods. For the years 1972-1983, the magnitude of the
track forecast error by the JTWC for western North Pacific
tropical cyclones was approximately 213 km (115 n mi),
407 km (220 n mi) and 667 km (360 n mi) for the
24-, 48- and 72-hour forecasts, respectively (Joint Typhoon Warning
Center,  1983).  Improvement  over  these  forecast  errors  is 
seen  to  be  a  realistic  goal. 

B.  OBJECTIVES 

The main objective of this study is to develop a
"statistical-climatological" method to forecast tropical
cyclone movement. However, computational requirements for
the development of a regression model from a large synoptic
grid system limit the number of possible grid-point
predictors.  An  Empirical  Orthogonal  Function  (EOF)  approach 
similar  to  that  used  by  Shaffer  and  Elsberry  (1982)  is 
therefore  adopted.  If  an  attempt  were  made  to  develop  a 
regression  model  using  a  large  synoptic  grid  system,  the 


where  v  is  the  Lagrange  multiplier,  I  the  identity  matrix 
and  o  the  null  vector.  Nontrivial  solution  of  (3.4) 
requires  v  to  satisfy: 

$|R - vI| = 0$ .  (3.5)

The values of v are thus the eigenvalues of the correlation
matrix R, and e is the associated (normalized) eigenvector.
Premultiplication of (3.4) by e' and application of the
constraint e'e = 1 from (3.3) gives:

$v = e'Re$  (3.6)

Since  v  was  chosen  to  maximize  this  correlation,  v  must  be 
the  largest  eigenvalue  of  R.  Morrison  (1967)  extends  this 
constrained  maximum  method  to  show  that  the  m  eigenvalues  of 
R account for the variance in each of the m dimensions. In
the following discussion, the eigenvalues are ordered such
that:

$v_1 \geq v_2 \geq \cdots \geq v_m$ .  (3.7)

Also, the importance of the ith eigenvalue is measured by

$L_k = \sum_{i=1}^{k} v_i / \sum_{i=1}^{m} v_i = \sum_{i=1}^{k} v_i / \mathrm{tr}\,R$ ,  (3.8)

where $L_k$ is the fraction of the total variation in R
accounted for by the eigenvectors associated with the k
largest eigenvalues. The trace of the correlation matrix
(tr R) is equal to its order (m).
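
The ordering (3.7) and the variance fraction (3.8) can be illustrated with a minimal sketch in Python (NumPy). The synthetic random data, the small dimensions and the function name explained_fraction are assumptions made only for illustration; they are not part of the analysis described in this study.

```python
import numpy as np

def explained_fraction(A):
    """Return eigenvalues (descending), eigenvectors and cumulative variance fraction.

    A is an m x n data matrix: m grid points (rows) by n cases (columns).
    """
    # Standardize each row (grid point) to zero mean and unit variance.
    b = A.mean(axis=1, keepdims=True)
    s = A.std(axis=1, keepdims=True)
    Z = (A - b) / s
    n = A.shape[1]
    # Correlation matrix R (m x m) of the standardized data.
    R = Z @ Z.T / n
    # eigh returns ascending eigenvalues for a symmetric matrix; reverse them.
    v, E = np.linalg.eigh(R)
    v, E = v[::-1], E[:, ::-1]
    # Cumulative fraction of total variation; tr R equals the order m.
    L = np.cumsum(v) / np.trace(R)
    return v, E, L

rng = np.random.default_rng(0)   # synthetic data, for illustration only
A = rng.normal(size=(6, 40))
v, E, L = explained_fraction(A)
```

In this sketch the eigenvalues sum to m (here 6), so the last element of L is one and L[k-1] gives the fraction of variance carried by the k leading modes.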

Any of the input data cases (stored in a particular
column of A) is "reproducible" by application of the EOF
coefficients defined by:

$C = E'A$ ,  (3.9)

where C is an m x n orthogonal matrix and E is the m x m
orthonormal matrix of the eigenvectors. Matrix E is formed
such that column j holds the normalized eigenvector
associated with eigenvalue j. Since E is orthonormal, (3.9) gives
directly that the data matrix A can be recreated as:

$A = EC$ .  (3.10)

Thus, the EOF analysis results in a factorization (3.10) of
the data matrix A. Matrix E of eigenvectors represents the
spatial decomposition of the data variance into orthogonal
modes. The coefficient matrix C accounts for the temporal
variance.

The  replication  of  the  data  matrix  A  is  exact.  The 
potential  for  application  of  the  analysis  with  independent 
data is discussed in Chapter IV. It is noted that exact
reproduction is not possible for cases not in the
developmental set of cases. Such a recreation would not be
possible using a finite sum of functions of an orthogonal
family. Case j (stored in column j of matrix A) is
represented by a linear combination of the orthogonal
coefficients and eigenvectors:

$a(j) = \sum_{i=1}^{m} c(i,j) \, e(i)$  for j = 1, ..., n ,  (3.11)

where a(j) is the column vector j of matrix A, the c(i,j)
are elements of the coefficient matrix C and e(i) is the
eigenvector in column i of matrix E.
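
The factorization can be checked numerically. The following Python (NumPy) sketch is a hedged illustration only: it uses a small synthetic matrix with the row means removed, and the eigenvectors of A A'/n rather than of the correlation matrix R, which is an assumption made to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)          # synthetic data, illustration only
m, n = 5, 30                            # m grid points, n cases
A = rng.normal(size=(m, n))
A = A - A.mean(axis=1, keepdims=True)   # remove the row means

# Eigenvectors of the symmetric m x m matrix A A' / n form the columns of E.
v, E = np.linalg.eigh(A @ A.T / n)
E = E[:, ::-1]                          # order by decreasing eigenvalue

C = E.T @ A                             # EOF coefficients, as in (3.9)
A_rebuilt = E @ C                       # factorization, as in (3.10)

# Case j as a sum over all m modes, as in (3.11).
j = 3
a_j = sum(C[i, j] * E[:, i] for i in range(m))

rows_orthogonal = C @ C.T               # off-diagonal terms vanish
```

The product C C' is diagonal in this sketch, which is the sense in which the rows of the coefficient matrix are mutually orthogonal.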

A  word  of  caution  should  be  given  here.  The  factoriza¬ 
tion  (3.10)  is  unique  up  to  the  coefficient  signs  since  the 
coefficients  are  computed  such  that  the  variance  is  parti¬ 
tioned  orthogonally  into  successively  smaller  portions. 


This  uniqueness  results  since  the  partitions  formed  are 
distinct.  The  researcher  might  be  tempted  to  use  orthogonal 
(matrix)  transformations  on  the  coefficient  matrix  in  an 
attempt  to  simplify  the  interpretation  of  the  subject 
matter.  The  transformed  matrix  will  generate  the  original 
data  just  as  exactly  as  before;  however,  the  eigenvectors  no 
longer  represent  the  same  maximum  percentages  of  variance. 
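
The effect of such a transformation can be seen in a small numerical sketch in Python (NumPy). The synthetic data and the 45-degree rotation of the first two modes are hypothetical choices made only to illustrate the point; they do not correspond to any transformation used in this study.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 50
# Rows with very different variances so the leading mode is dominant.
A = rng.normal(size=(m, n)) * np.array([[3.0], [2.0], [1.0], [0.5]])
A = A - A.mean(axis=1, keepdims=True)

v, E = np.linalg.eigh(A @ A.T / n)
v, E = v[::-1], E[:, ::-1]
C = E.T @ A

# A hypothetical orthogonal transformation: rotate modes 1 and 2 by 45 degrees.
t = np.pi / 4
Q = np.eye(m)
Q[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]

E_rot = E @ Q
C_rot = Q.T @ C

var_unrotated = (C ** 2).sum(axis=1) / n      # equals the eigenvalues v
var_rotated = (C_rot ** 2).sum(axis=1) / n
```

Here E_rot @ C_rot still reproduces A, and the total variance is unchanged, but the first rotated mode carries less variance than the first unrotated mode.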

It  is  generally  found  that  an  adequately  large 
percentage  of  the  total  variation  in  R  (and  hence  in  A)  can 
be  attributed  to  the  first  p  eigenvectors  such  that  p  is 
much smaller than the total number of eigenvectors (m),
particularly when m is large (Morrison, 1967). Case j is
then approximated by:

$a(j) \approx \sum_{i=1}^{p} c(i,j) \, e(i)$  for j = 1, ..., n .  (3.12)

It is possible to recreate the input data elements of matrix
A from the standardized matrix Z. If the first p
eigenvectors are retained, then (3.11) is approximated by:

$a(i,j) \approx \left[ \sum_{k=1}^{p} c(k,j) \, e(i,k) \right] s(i) + b(i)$ ,  (3.13)

where the e(i,k) are elements of the eigenvector matrix E.
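
A truncated reconstruction of this kind can be sketched in Python (NumPy) as follows. It is assumed here that b(i) and s(i) are the mean and standard deviation of row i of A used in the standardization; the synthetic data and the helper name rebuild are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)           # synthetic data, illustration only
m, n = 8, 60
A = rng.normal(loc=5.0, scale=2.0, size=(m, n))

b = A.mean(axis=1, keepdims=True)        # assumed meaning of b(i): row mean
s = A.std(axis=1, keepdims=True)         # assumed meaning of s(i): row std. dev.
Z = (A - b) / s

v, E = np.linalg.eigh(Z @ Z.T / n)       # eigen-decomposition of R
v, E = v[::-1], E[:, ::-1]
C = E.T @ Z                              # coefficients of the standardized data

def rebuild(p):
    """Approximate A from the first p modes, following the form of (3.13)."""
    return (E[:, :p] @ C[:p, :]) * s + b

# Root-mean-square reconstruction error for p = 1, ..., m.
errors = [np.sqrt(np.mean((rebuild(p) - A) ** 2)) for p in range(1, m + 1)]
```

The error does not increase as modes are added, and it vanishes (to rounding) when all m modes are summed.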

Shaffer  (1982)  discussed  the  rotation  of  eigenvectors 
computed  in  an  EOF  analysis.  He  gave  a  very  simple 
example of rotation and contrasted orthogonal rotation with
oblique.  The  possibility  that  unrotated  eigenvectors  may  not 
represent  the  true  synoptic  patterns  was  also  explored  and 
evidence  given  that  this  should  not  occur  for  true 
geophysical  synoptic  fields.  Rotation  was  not  performed  on 
the  eigenvectors  in  this  study  for  several  reasons.  First, 
the  eigenvectors  were  needed  to  generate  the  coefficients  to 
be  used  for  the  regression  analysis.  As  such,  the  ability 
to  interpret  physically  the  eigenvectors  is  not  as  important 


as in studies in which this is a major objective. For
example, Legler's (1983) study of the tropical Pacific
trades  showed  that  rotation  of  the  resultant  eigenvectors 
can  be  essential  to  interpreting  the  patterns  as  well  as  to 
simplifying  the  statistical  analysis  of  the  data.  Second,  a 
goal  of  this  study  was  to  reduce  the  data  required  for  fore¬ 
casting.  This  was  done  by  analyzing  the  amount  of  variance 
accounted  for  by  the  various  eigenvectors.  Were  the  eigen¬ 
vectors  to  have  been  rotated,  they  would  no  longer  account 
for  the  same  percentage  of  the  total  variance.  For  further 
discussion on the rotation of eigenvectors, the reader is
referred  to  Richman  (1981). 

C. SELECTING THE NUMBER OF EIGENVECTORS

One important advantage of the EOF technique is that of
summarizing  most  of  the  variation  in  a  multivariate  system 
in  terms  of  fewer  variables.  Unless  the  system  is  defective 
(less than full rank), some variance will always be
unexplained if fewer than m eigenvectors, where m is the row
dimension of the data matrix A, are taken to describe the
system. The problem
faced  by  the  model  builder  is  to  determine  the  number  of 
eigenvectors  to  provide  a  parsimonious,  yet  fairly  adequate, 
description  of  a  data  system.  Various  methods  have  been 
applied  to  determine  how  many  eigenvectors  are  significant; 
that  is,  possess  maximum  information  with  minimum  noise. 
The  classical  methodology  outlined  by  Morrison  (1967)  is 
based  upon  the  asymptotic  behavior  of  the  eigenvalues.  This 
approach  operates  on  the  assumption  of  a  large  sample  of 
normal  data.  If  standardized  data  are  utilized,  the 
sampling  statistics  are  considerably  more  complex  (Anderson, 
1963). Preisendorfer and Barnett (1977) observed that this
method  is  generally  not  suited  to  geophysical  studies  in 
which  sample  sizes  are  too  small  to  have  the  requisite 


asymptotic  behavior.  Shaffer  (1982)  found  the  asymptotic 
assumption  to  be  invalid  for  his  study  of  504  cases  (with 
120  data  points  each)  of  geopotential  heights.  Another 
alternative  is  to  use  the  LEV  (Logarithmic  Eigenvalue) 
diagram method (Rinne and Karhila, 1979) which identifies
those  structural  differences  of  the  eigenvectors  that 
describe  noise  instead  of  signal.  Although  this  method  is 
simple,  it  is  unsatisfactory  because  of  the  subjectivity 
required  on  the  part  of  the  researcher  and  the  lack  of  a 
strong  theoretical  basis.  Other  methods  such  as  those  of 
Richman (1980) or Brown (1981) are also rejected because
they are too subjective in their applications. Methods such
as presented by Cattell (1953) and Guttman (1954) are
considered unsuitable because of the danger of probable
overfactoring and their lack of a scientific basis.

The  method  used  in  this  study  is  a  Monte  Carlo  approach 
(Preisendorfer and Barnett, 1977). This approach was chosen
over Morrison's (1967) because: (1) it does not require
asymptotic  behavior  of  the  eigenvalues;  (2)  it  is  objective; 
and  (3)  it  is  based  on  statistical  methodology.  The  first 
step  in  this  method  is  to  generate  at  random  a  large  number 
(at  least  100)  of  data  fields  consisting  of  standard  normal 
deviates,  which  are  then  assembled  into  a  matrix  Z.  Matrix 
Z  is  therefore  assumed  to  represent  a  data  matrix  obtainable 
if  all  processes  are  purely  random.  Next,  the  eigenvalues 
are  computed  for  each  matrix  Z.  Means  and  standard  devia¬ 
tions  are  determined  for  the  simulated  eigenvalues.  The 
eigenvalues obtained from the physical data are compared
with those from the simulated deviates. If the true
eigenvalue deviates from the mean of the corresponding random
data eigenvalues by more than two (three) standard
deviations, then the true eigenvalue is significant at the 95
percent (98 percent) confidence level (Preisendorfer and
Barnett, 1977). That is, deviation of the true eigenvalue


from  the  mean  simulated  eigenvalue  by  at  least  two  standard 
deviations  is  indicative  that  the  associated  eigenvector 
represents signal rather than noise. As successive
coefficients are computed, a running sum (using (3.12) or (3.13)
as  appropriate)  can  be  formed  and  compared  with  data  matrix 
A  to  determine  how  well  the  data  matrix  is  being  generated 
by  a  smaller  number  of  modes. 
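
The Monte Carlo selection steps described above can be sketched in Python (NumPy). This is a minimal illustration under stated assumptions, not the procedure of Preisendorfer and Barnett (1977) as published: the function names, the synthetic test data, and the choice to stop at the first mode that fails the test are all assumptions.

```python
import numpy as np

def correlation_eigenvalues(A):
    """Descending eigenvalues of the correlation matrix of the rows of A (m x n)."""
    Z = (A - A.mean(axis=1, keepdims=True)) / A.std(axis=1, keepdims=True)
    R = Z @ Z.T / A.shape[1]
    return np.linalg.eigvalsh(R)[::-1]

def monte_carlo_modes(A, n_trials=100, n_sigma=2.0, seed=0):
    """Count leading modes whose eigenvalue exceeds the random-data mean
    by more than n_sigma standard deviations."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Eigenvalues of n_trials matrices of standard normal deviates.
    sims = np.array([correlation_eigenvalues(rng.standard_normal((m, n)))
                     for _ in range(n_trials)])
    threshold = sims.mean(axis=0) + n_sigma * sims.std(axis=0)
    real = correlation_eigenvalues(A)
    significant = real > threshold
    # Assumption: retain modes up to the first one that fails the test.
    n_keep = m if significant.all() else int(np.argmin(significant))
    return n_keep, real, threshold

# Synthetic example: two planted patterns plus noise (illustration only).
rng = np.random.default_rng(42)
m, n = 20, 200
patterns = rng.standard_normal((m, 2))
amplitudes = rng.standard_normal((2, n)) * np.array([[3.0], [2.0]])
A = patterns @ amplitudes + rng.standard_normal((m, n))
n_keep, real, threshold = monte_carlo_modes(A)
```

In this sketch n_keep is the number of leading modes judged to represent signal, and real and threshold can be plotted against mode number to compare the data eigenvalues with the random-data criterion.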


IV. RESULTANT EMPIRICAL ORTHOGONAL FUNCTIONS


A.  STATISTICAL  ANALYSIS 

The  mathematical  and  theoretical  framework  developed  in 
Chapter  III  was  used  for  a  scalar  EOF  analysis  of  the  depen¬ 
dent  data  set  (682  cases  as  described  in  Chapter  II).  The 
major  purpose  of  this  phase  of  the  data  analysis  was  to 
compute  the  EOF  coefficients  needed  for  the  tropical  cyclone 
motion  forecast  scheme  proposed  in  Chapter  I.  Since  these 
EOF  coefficients  were  needed  for  use  as  possible  predictors 
in  separate  regression  equations  for  zonal  and  meridional 
storm  movement,  a  scalar  rather  than  vector  representation 
was  considered  to  be  adequate.  For  each  of  the  zonal  and 
meridional  wind  fields  at  700  mb,  400  mb  and  250  mb,  a 
527 x 682 data matrix A was formed using the interpolated
fields  as  columns.  A  matrix  Z  of  standardized  data  was 
computed  for  each  matrix  A,  and  the  resultant  eigenvalues 
and  corresponding  eigenvectors  were  determined.  For  each 
wind-component  field,  527  modes  (eigenvectors)  were  gener¬ 
ated.  The  EOF  coefficients  for  each  of  the  682  cases  were 
also  computed  for  each  of  the  six  wind-component  fields. 

The  eigenvalues  and  cumulative  percentage  of  total  vari¬ 
ance  for  the  zonal  and  meridional  fields  are  presented  by 
pressure  level  in  Tables  II  through  IV.  The  eigenvalues, 
and hence the significance of their associated modes,
decrease  rapidly  with  increasing  mode  number.  Zonal-field 
eigenvalues  decrease  at  approximately  twice  the  rate  of 
decrease  of  the  meridional  eigenvalues. 

Although  many  modes  resulted  because  of  the  number  of 
grid  points  per  case,  most  of  the  higher  order  modes  repre¬ 
sent  noise  rather  than  signal.  To  determine  the  number  of 
modes to be retained, a Monte Carlo simulation was run as
described in Chapter III. A random number generator for
standard  normal  deviates  was  used  to  simulate  100  standard¬ 
ized  527  x  682  data  matrices  Z.  The  statistical  "structure" 
of  these  random  fields  parallels  that  of  the  standardized 
fields  of  the  real  data.  For  each  of  the  100  simulated 
matrices  of  682  cases  of  random  standard  scores,  the  EOF 
analysis  was  performed  to  yield  100  sets  of  527  eigenvalues 
(one  per  field  grid  point).  The  means  and  standard  devia¬ 
tions  of  the  Monte  Carlo  eigenvalues  were  computed.  If  the 
eigenvalue  for  a  mode  computed  from  the  real  data  was 
greater  than  the  corresponding  mean  eigenvalue  plus  twice 
its  standard  deviation  as  computed  from  the  random  data, 
then  the  eigenvalue  and  eigenvector  from  the  real  data  were 
selected  as  representing  atmospheric  signal.  The  corre¬ 
sponding  mode  was  retained  at  the  95  percent  confidence 
level.  Table  V  contains  the  mean  eigenvalues  of  the  Monte 
Carlo  simulation  as  well  as  these  mean  eigenvalues  plus 
twice  their  standard  deviation. 

Comparisons  of  the  six  sets  of  real-field  eigenvalues  to 
those  of  the  random  fields  are  performed  separately  since 
the  number  of  significant  eigenvectors  may  be  different  for 
each  level.  The  only  relationship  between  the  modes  of  the 
six  fields  for  the  three  levels  comes  from  any  vertical 
coupling  that  may  exist.  Fig.  7  illustrates  the  eigenvalues 
for  the  700  mb  zonal  wind  field  and  the  Monte  Carlo  simula¬ 
tion  for  the  first  40  modes.  Twenty-four  modes  are  indi¬ 
cated  to  represent  signal.  Table  VI  is  a  summary  of  the 
number  of  modes  to  be  retained  and  the  percentages  of  total 
variance described according to the Monte Carlo selection
criterion.  Some  general  observations  can  be  made.  For 
either  the  zonal  or  the  meridional  flow,  the  number  of  modes 
that  represent  signal  at  700  mb  is  less  than  that  at  400  mb, 
which  in  turn  is  less  than  that  at  250  mb.  For  any  of  the 
levels analyzed, a smaller number of zonal modes than
meridional are retained with a higher percentage of total
variance represented.

B. INTERPRETATION OF RESULTS

The percentages of variance unexplained (noise) are
realistic for a tropical wind analysis. Errors are largely due
to data distributions or measurement errors. The analysis
problem is difficult because of the weak governing mass-wind
balance relationship in the tropics (Haltiner and Williams,
1980). Therefore, it is plausible that the level of random
error in the wind-component fields is as high as 18.3
percent. This maximum percentage of "noise" (for the
meridional wind fields at 700 mb) corresponds to the largest
number of modes (35) selected to represent "signal".

In  the  subsequent  regression  analysis,  only  the  first  35 
modes  of  the  zonal  and  meridional  wind  fields  will  be  used 
in  the  development  of  the  corresponding  zonal  and  meridional 
storm  movement  equations  for  each  of  the  three  pressure 
levels  analyzed.  The  retention  of  35  modes  for  each  wind 
component  field  provided  the  maximum  possible  selection  of 
modes  without  including  unnecessary  noise.  Using  only  35 
coefficients  out  of  527  is  a  remarkable  data  reduction  of  93 
percent.  For  each  field  it  is  necessary  to  store  only  the 
eigenvector  matrix  E  and  the  first  35  coefficients  for  each 
case,  which  will  account  for  no  less  than  81.7  percent  of 
the  total  variance.  Table  VII  lists  the  percentages  of 
variance  accounted  for  when  35  modes  are  retained  for  all 
wind  component  fields.  At  least  90  percent  of  the  total 
variance  of  any  zonal  wind  field  is  accounted  for.  While 
the  number  of  EOF  coefficients  needed  is  much  larger  than 
the 10 per case in Shaffer (1982), 35 modes per field is
still  a  tractable  number  of  potential  predictors  for  regres¬ 
sion  analysis. 
It is beneficial to investigate the physical
significance of the modes determined to represent signal. Shaffer
(1982) found that the broad-scale features of eigenvectors
derived from geopotential height fields had meteorological
meaning. Contours of modes one and two (multiplied by 100)
for  the  700  mb  zonal  and  meridional  fields  are  presented  in 
Figs.  8  and  9.  The  eigenvectors  are  non-dimensional,  since 
standardized  data  were  used  for  the  EOF  analysis.  Two 
points  must  be  stressed.  First,  there  is  no  mathematical 
connection  between  any  zonal-field  mode  and  the  same  mode  of 
the  meridional  field.  That  is,  it  is  not  possible  to  regain 
the  vector  nature  of  the  wind  by  a  combination  of  zonal  and 
meridional eigenvectors. Second, each eigenvector
represents the pattern shown as well as the exact inverse of the
pattern.  For  a  given  field,  the  forcing  pattern  of  a  parti¬ 
cular  eigenvector  is  dependent  upon  the  sign  of  the  associ¬ 
ated  EOF  coefficient.  If  the  coefficient  is  negative,  then 
the  forcing  pattern  of  the  eigenvector  is  "inverted". 
Positive  (negative)  components  of  the  field  are  reversed  to 
negative  (positive).  The  following  discussion  will  use 
eigenvector  patterns  as  shown  without  considering  the 
inverse  patterns. 

The patterns of the 700 mb modes 1 and 2 in Figs. 8 and
9 can be interpreted separately as possible atmospheric flow
patterns. Mode 1 of the 700 mb zonal flow (Fig. 8) shows a
cyclonic shear across the cyclone, with easterlies to the
north of the cyclone and westerlies to the south. Mode 2 of
the 700 mb zonal flow (Fig. 8) is dominated by broad easterly
flow.  The  zonal  modes  1  and  2  at  400  mb  and  250  mb 
(not  shown)  are  characterized  by  diminished  equatorial 
westerlies  to  the  south  of  the  cyclone.  Modes  1  and  2  of 
the  700  mb  meridional  flow  (Fig.  9)  both  show  alternating 
bands  of  positive  and  negative  flow.  These  patterns  are 
typical  for  trough-ridge-trough  arrangements.  Speed  maxima 
in the bands are located north of the cyclone. The cyclone
is  again  located  in  a  region  of  cyclonic  shear.  Meridional 
modes  1  and  2  at  400  mb  and  250  mb  (not  shown)  depict  the 
eastward  slope  of  the  700  mb  patterns  with  elevation.  These 
modes, which individually account for the largest
percentages of total variance in their corresponding fields, are
indeed  patterns  or  signals  that  appear  to  relate  to  tropical 
cyclone  movement. 

Complexity  of  the  eigenvectors  made  it  difficult  to 
associate observable atmospheric patterns with higher order
modes  for  any  of  the  fields.  Legler  (1983)  has  observed 
that  examination  of  the  eigenvectors  to  give  appropriate 
physical  interpretations  may  be  impractical  for  data 
collected  over  large  grids.  Over  large  areas,  signals  from 
two  or  more  physical  processes  may  be  overlaid  in  a  single 
eigenvector.  This  can  occur  since  there  are  no  restrictions 
as  to  how  the  patterns  for  a  particular  process  may  be 
"decomposed"  among  the  eigenvectors.  Moreover,  particularly 
strong  atmospheric  signals  may  appear  in  more  than  one 
eigenvector. Under such circumstances any realistic
interpretation of the modes may be precluded.

It  is  also  important  to  verify  that  the  significant 
modes  selected  for  retention  do  satisfactorily  represent  the 
data  fields.  A  case  (0000  GMT  30  July  79)  was  selected  at 
random to demonstrate the reconstruction capability of an
EOF analysis. At this time, Typhoon Hope was at
approximately 16.9 N, 133.4 E with maximum sustained winds of
38.6 m/s (75 kt). The actual zonal wind field at 700 mb and
the  reproduction  by  summing  all  527  modes  are  shown  in 
Fig.  10.  The  reproduction  of  the  original  field  is  seen  to 
be  exact.  If  the  eigenvector  matrix  were  to  be  used  to 
generate  coefficients  for  a  case  not  included  in  the  depen¬ 
dent  sample,  the  reproduction  produced  by  summing  over  all 
modes  would  not  be  exact.  The  fields  obtained  by  summing 
the first 5, 15, 25 and 35 modes are shown in Figs. 11 and 12.
When only five modes are summed, only the gross patterns
(positive  flow  versus  negative)  are  reproduced.  Yet,  it  is 
interesting  to  observe  how  only  five  coefficients  and  modes 
can  begin  to  recreate  a  particular  field  using  eigenvectors 
derived  from  all  682  cases.  As  the  number  of  modes  is 
increased,  an  increasing  amount  of  the  complexity  of  the 
original  field  is  replicated  (Figs.  11  and  12).  In  the  next 
chapter,  the  EOF  coefficients  derived  for  the  zonal  and 
meridional  wind  fields  will  be  used  as  potential  predictors 
representing the synoptic-scale forcing in a stepwise
regression  procedure. 


V. REGRESSION ANALYSIS


A. MOTIVATION


Regression  analysis  is  one  of  the  most  widely  used 
statistical  tools.  Its  essence  is  the  study  of  relation¬ 
ships  among  variables  to  serve  three  major  purposes: 
description, control and prediction. The researcher's goal
is to find a simple mathematical model that, on the basis of
observed data, will fit a complex phenomenon. An excellent
presentation of theory and method that is conducive to
practical application is given by Neter and Wasserman (1974). A
more advanced presentation of statistical theory of the
complete general linear model is given by Graybill (1976).
Briefly, regression analysis involves using a linear
combination of known quantities (predictors) to estimate the
value of an unknown quantity (predictand).
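
As a minimal sketch of this idea, the following Python (NumPy) fragment fits a linear combination of predictors to a predictand by ordinary least squares. The data are synthetic, the weights in true_beta are hypothetical, and ordinary least squares is used here only for brevity; it is not the stepwise procedure applied in this study.

```python
import numpy as np

rng = np.random.default_rng(4)             # synthetic data, illustration only
n_cases, n_predictors = 200, 3
X = rng.normal(size=(n_cases, n_predictors))       # predictors
true_beta = np.array([2.0, -1.0, 0.5])             # hypothetical weights
y = 1.5 + X @ true_beta + 0.1 * rng.normal(size=n_cases)   # predictand

# Add a column of ones so the fit includes a constant term.
X1 = np.column_stack([np.ones(n_cases), X])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)

y_hat = X1 @ beta                          # estimated predictand
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

The fitted vector beta holds the constant term followed by one weight per predictor, and rmse summarizes how closely the linear combination reproduces the predictand on the dependent sample.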

EOF  coefficients  have  been  demonstrated  to  give  a 
convenient,  quantitative  representation  of  physical  forcing 
mechanisms acting on tropical cyclones (Chapter IV).
Previous  studies  (described  in  Chapter  I)  have  shown  that 
statistical  forecast  schemes  based  on  regression  equations 
are  viable  methods.  In  particular,  it  is  possible  to  use 
EOF  coefficients  based  on  geopotential  heights  as  predictors 
to  forecast  tropical  cyclone  movement  (Shaffer  and  Elsberry, 
1982; Shapiro and Neumann, 1984). The hypothesis here is
that the EOF coefficients derived to represent wind forcing
of a tropical cyclone would be useful predictors of future
storm  movement. 

Western  North  Pacific  tropical  cyclone  position  forecast 
errors for 10 years (1966-1975) have been statistically
analyzed (Jarrell et al., 1978). The examination of errors
revealed that a small number of readily available parameters
can classify, with reasonable effectiveness, a tropical
cyclone forecast as representing a group of storms with
either  markedly  above  or  below  average  errors.  These  vari¬ 
ables  include  storm  location,  maximum  sustained  wind  and  the 
components  of  motion.  Thus,  it  is  hypothesized  that  these 
parameters  might  also  be  appropriate  predictors  of  tropical 
cyclone movement. Regression analyses were performed to
investigate  these  hypotheses. 

B.  VARIABLE  AND  CASE  SELECTION 

A  primary  goal  of  any  regression  analysis  is  to  choose  a 
set  of  independent  variables  that  is  "best".  Here  the 
criterion "best" is defined as minimizing the sum of squares
of  residuals  without  overfitting.  Practicality  requires 
that  there  be  a  scope  of  the  model;  that  is,  the  coverage  of 
a  model  is  restricted  to  some  region  or  interval  of  values 
of  the  independent  variables.  Model  coverage  is  determined 
by  the  dependent  cases  included  in  the  analysis.  Possible 
difficulties  are  considered  later  in  this  chapter. 

Predictands  for  this  study  are  the  average  24-,  48-  and 
72-hour zonal and meridional translation speeds of the
tropical cyclone. These average speeds were determined from the
case-time JTWC warning position and the subsequent JTWC
warning position at 24, 48 or 72 hours. Positive motion was
defined  to  the  north  and  to  the  west,  since  the  majority  of 
tropical  cyclones  tracked  to  the  north  and  west.  As  there 
are  six  predictands,  six  regression  equations  are  required 
for  each  of  the  pressure  levels  included  in  the  study 
(700  mb,  400  mb  and  250  mb)  .  A  total  of  18  equations  was 
der  i  v  ed . 
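
The predictand definition above can be illustrated with a minimal sketch. It assumes a locally flat Earth at the mean latitude of the two positions and a mean Earth radius of 6371 km; neither assumption is stated in the thesis, and the function name and sample positions are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
KM_PER_DEG_LAT = math.pi * EARTH_RADIUS_KM / 180.0


def average_speeds(lat0, lon0, lat1, lon1, hours):
    """Return (zonal, meridional) average speeds in km/hr.

    Positions are in degrees north and degrees east. Following the text,
    motion toward the west and toward the north is positive.
    """
    mean_lat = math.radians(0.5 * (lat0 + lat1))
    north_km = (lat1 - lat0) * KM_PER_DEG_LAT
    west_km = (lon0 - lon1) * KM_PER_DEG_LAT * math.cos(mean_lat)
    return west_km / hours, north_km / hours


# Hypothetical case: a storm moving toward the northwest over 24 hours.
u24, v24 = average_speeds(15.0, 135.0, 17.0, 132.0, 24.0)
print(f"zonal {u24:.2f} km/hr, meridional {v24:.2f} km/hr")
```

Both components come out positive for a northwestward track, which matches the sign convention chosen for the predictands.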

It  is  emphasized  that  the  predictands  were  computed 
using JTWC warning positions at both base time and the

VII. CONCLUSIONS AND SUGGESTED RESEARCH


The  results  described  in  this  thesis  must  be  regarded  as
preliminary.  However,  it  appears  sufficiently  promising 
that  a  viable,  efficient  regression  scheme  involving  EOF 
coefficients  to  represent  wind  forcing  can  be  developed. 
Two  improvements  are  suggested  before  any  operational 
testing  might  be  performed.  First,  the  predictands  should
be  computed  using  the  JTWC  warning  position  at  base  time  and 
the  JTWC  best-track  positions  at  the  predictand  times.  The 
best-track  positions  are  based  on  a  post-season  analysis 
using  all  information  available.  The  use  of  warning  posi¬ 
tions  for  the  locations  of  the  cyclone  at  predictand  times 
unnecessarily  contaminates  the  predictands.  It  is  appro¬ 
priate  to  use  the  warning  position  to  locate  the  tropical 
cyclone  at  base  time  because  this  is  the  only  position 
available  at  the  time  of  the  forecast.  Second,  forecast 
error  should  correspondingly  be  defined  as  the  deviation  of 
the  forecast  position  from  the  best-track  position. 

Adoption of an operational forecast model requires
testing using both dependent and independent data. The
EOF-regression forecast errors should be compared with those
of forecasts obtained by another operational model (such as
CLIPER) and of the JTWC. The ultimate utility of the model
depends upon demonstrated forecast skill for operational data,
regardless of prior performance on dependent data or indications
of a statistical significance test. Results obtained
in the present study indicate very good potential for an
operational model.

movement forecast could be generated upon input of the
appropriate zonal and meridional components at the 527
points. Operational implementation of such a statistical-
climatological method appears to be feasible.

Assuming that the eigenvector matrix E was determined
from an adequate (large) dependent data set, the same set of
eigenvectors can be used indefinitely for independent
cases (new tropical cyclones), within the limitations of the
scope of the model. Shaffer (1982) recommended that the
regression equations be updated at the conclusion of each
typhoon season. The feasibility and necessity of updating
can be questioned for the current model. Shaffer's cases
required 120 data points per case as opposed to two fields
of 527 data points each for this study. If each case
meeting selection requirements were added to the dependent
data set, computing difficulties would be likely to become
prohibitive after several tropical cyclone seasons. While
it "might" be advantageous at least to include the anomalous
cases, specific inclusion of anomalous cases could seriously
reduce the ability of the regression analysis to obtain a
good fit. Shaffer (1982) also suggested that increasing the
number of dependent data cases should result in fewer large
forecast errors. However, a larger dependent data set does
not imply a better fit (as measured by R2), nor does it
imply that the model will better forecast anomalous cases.
One alternative would be to have more than one set of
regression equations. A map-typing or analog technique
could be used to determine which set of equations would be
appropriate on a case-by-case basis. Such an alternate
method would lack simplicity, which is one of the most
attractive features of the current EOF-based regression
forecast scheme.

The forecast scheme using EOF-based regression models is
very simple compared with other more elaborate models. The
model requires only a set of coefficients representing the
synoptic-scale wind forcing and predictors representing
present position and past storm movement. The entire forecast
scheme could be executed using a minicomputer. The

for the new case. If n is large, the term $[1/(n+1)]\,z z^{T}$ in (6.1)
is negligible relative to the first term so that the
following approximation is valid:

\[ R(\mathrm{new}) \approx R(\mathrm{old}) \tag{6.2} \]


The eigenvalues and eigenvectors computed for the dependent
data using R(old) should be almost identical to those
obtained from computation using R(new). Provided that a
sufficiently large dependent sample is available, it is
reasonable to use R(old) to compute the EOF coefficients for
a new data case and then to use these coefficients as
predictors in the forecast equations derived with the dependent
data. Shaffer (1982) determined that use of the coefficients
for cases calculated using R(old) introduced very
little error into the movement forecast. Testing is
required to determine a sample size sufficient for (6.2) to
be valid. The reader is referred to Shaffer for a detailed
example of methodology appropriate to test these
observations.
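
A stability test of the kind described here might be sketched as follows. The sketch applies the update in (6.1) to a synthetic data set and compares the leading eigenvectors before and after the update. The sample size, number of variables, number of modes and the data themselves are illustrative assumptions, not the thesis values, and the function names are hypothetical.

```python
import numpy as np


def update_correlation(r_old, z, n):
    """Equation (6.1): add one standardized case z to an n-case matrix."""
    z = np.asarray(z, dtype=float).reshape(-1, 1)
    return (n / (n + 1.0)) * r_old + (z @ z.T) / (n + 1.0)


def leading_eigenvectors(r, k):
    """Return the k eigenvectors of symmetric r with the largest eigenvalues."""
    vals, vecs = np.linalg.eigh(r)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order]


def mode_alignment(e_old, e_new):
    """|cosine| between matching columns; 1.0 means an unchanged mode."""
    return np.abs(np.sum(e_old * e_new, axis=0))


# Synthetic stand-in for the dependent sample (sizes are illustrative only).
rng = np.random.default_rng(0)
n, m, k = 600, 20, 3
latent = rng.standard_normal((n, k)) * np.array([3.0, 2.0, 1.0])
data = latent @ rng.standard_normal((k, m)) + 0.5 * rng.standard_normal((n, m))
z_dep = (data - data.mean(axis=0)) / data.std(axis=0)
r_old = z_dep.T @ z_dep / n

z_new = np.linspace(-1.5, 1.5, m)             # hypothetical standardized new case
r_new = update_correlation(r_old, z_new, n)
alignment = mode_alignment(leading_eigenvectors(r_old, k),
                           leading_eigenvectors(r_new, k))
print("max |R(new) - R(old)|:", np.max(np.abs(r_new - r_old)))
print("alignment of leading modes:", alignment)
```

Alignment values close to 1 would indicate that the approximation (6.2) is adequate for that sample size. Repeating the test with smaller n gives a rough idea of how large a dependent sample is needed.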

Operational implementation of an EOF forecast scheme
would be straightforward. Two major operations are
required. First, the 35 required EOF coefficients for the
independent data cases must be computed and stored. This
involves multiplication of the 35 x 527 transpose matrix of
truncated eigenvectors and the 527 x 1 vector of standardized
observations. It is assumed that no significant error
would be associated with using the means and standard deviations
from the dependent sample at the equidistant grid
points. Second, these coefficients and other predictors
would be substituted into the regression equations to
predict the average zonal and meridional speeds for the
forecast interval. The predicted future location of the
tropical cyclone could then be determined.
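
The first of the two operations can be sketched in a few lines. The eigenvectors, means and standard deviations below are random stand-ins, since the thesis values are not reproduced here; only the dimensions (527 grid points, 35 retained modes) come from the text, and the function name is hypothetical.

```python
import numpy as np

N_POINTS, N_MODES = 527, 35   # grid points and retained modes, as in the text


def eof_coefficients(eigvecs_trunc, obs, mean, std):
    """Project one standardized field onto the truncated eigenvectors.

    eigvecs_trunc has shape (527, 35); its transpose is the 35 x 527 matrix
    described in the text. mean and std come from the dependent sample.
    """
    z = (np.asarray(obs, dtype=float) - mean) / std
    return eigvecs_trunc.T @ z


rng = np.random.default_rng(1)
# Stand-in orthonormal eigenvectors from a QR factorization (not the thesis EOFs).
eigvecs, _ = np.linalg.qr(rng.standard_normal((N_POINTS, N_MODES)))
mean = rng.normal(0.0, 5.0, N_POINTS)       # hypothetical dependent-sample means
std = rng.uniform(2.0, 8.0, N_POINTS)       # hypothetical standard deviations

true_coeffs = rng.standard_normal(N_MODES)
obs = mean + std * (eigvecs @ true_coeffs)  # synthetic wind-component field
coeffs = eof_coefficients(eigvecs, obs, mean, std)
print(coeffs.shape)
```

Because the eigenvectors are orthonormal, the projection recovers the coefficients used to build the synthetic field. In operational use the same call would be made once for the zonal and once for the meridional component.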



VI. POTENTIAL FOR USE WITH INDEPENDENT DATA

Based on results obtained using dependent data and
predictands derived using warning positions, EOF-based
regression forecasting appears to have potential for
improved prediction of tropical cyclone movement. The value
of the final model depends upon its potential for operational
use with independent data. The regression equations
were derived using EOF coefficients computed using a particular
set of eigenvectors; namely, the eigenvector matrix E
of the dependent data set. These regression equations are
applicable only for tropical cyclone cases within the scope
of the model. The scope of the model is determined primarily
by the values of the predictors and predictands used to
derive the forecast equations. EOF coefficients are the
most sensitive predictors in that they are derived from the
particular flow fields surrounding the tropical cyclones.
For a new case, the eigenvectors no longer exactly represent
the maximum variation in all of the observations (dependent
set plus the new case). The stability of the eigenvectors
must be examined by determining whether the eigenvectors and
coefficients of the dependent data cases remain nearly the
same if a new case is added.

Inclusion of an additional case changes the correlation
matrix R. The new correlation matrix can be computed by:

\[ R(\mathrm{new}) = \frac{n}{n+1}\,R(\mathrm{old}) + \frac{1}{n+1}\,z z^{T}, \tag{6.1} \]

where R(new) is the new correlation matrix after addition of
the new case, R(old) the original correlation matrix of the
dependent data, n the number of cases prior to inclusion of
the new case, and z the m x 1 vector of standardized data

Multicollinearity exists unless the variables (including
the EOF coefficients) are completely pairwise uncorrelated.
This rarely occurs naturally. When the independent variables
are highly correlated, the predictive ability of the
model is suspect for new cases whose independent variables
deviate from the pattern of multicollinearity in the dependent
cases.


errors.  These  results  do  not  appear  to  agree  with  Jordan
(1952)  and  Miller  (1958)  who  were  unsuccessful  at  using 
winds  and  heights  at  upper  tropospheric  levels  to  describe 
tropical  cyclone  motion. 

It  was  also  important  to  examine  the  results  for  consis¬ 
tency  in  the  forecasts.  Consistency  would  be  indicated  by 
small  standard  deviations  of  forecast  error.  The  standard 
deviations  were  generally  comparable  to  Shaffer  (1982). 
There  were  no  significant  differences  among  the  standard 
deviations  for  a  given  forecast  interval,  except  the 
standard  deviation  for  the  72-hour  forecast  using  the  250  mb
equation  was  notably  smaller  than  that  for  either
700  mb  or  400  mb  equations. 

D.  CAUTIONS  FOR  USE  OF  THE  REGRESSION  MODEL

Various  restrictions  should  be  considered  when  applying 
the  results  of  a  regression  analysis.  The  validity  of  the 
predictions  depends  upon  whether  basic  causal  conditions  at 
later  times  will  be  similar  to  those  in  effect  for  the  data 
used  for  the  regression  analysis.  The  scope  of  the  data 
must  be  respected  to  avoid  inferences  based  on  an  indepen¬ 
dent  variable  which  falls  outside  the  range  of  input  data. 
Finally,  it  must  be  remembered  that  the  predictands  used  to 
derive  the  equations  were  computed  using  the  JTWC  warning 
positions  at  both  the  base  time  and  at  subsequent  forecast
times.

The performance of the model as indicated by the dependent
sample may be superior to its performance on new cases.
This  is  known  as  prediction  bias,  which  results  when  the 
final  model  chosen  is  too  uniquely  related  to  the  input  data 
cases.  It  is  emphasized  that  the  models  developed  in  this 
study  have  not  been  tested  with  independent  data  cases. 


forecast  interval.  Finally,  the  forecast  error  was  computed 
by  determining  the  magnitude  of  the  vector  between  the  fore¬ 
cast  position  and  the  JTWC  warning  position  at  the  corre¬ 
sponding  time. 
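
The verification steps described above (speed to displacement, displacement to position, position to error) can be sketched as follows. The flat-Earth displacement step, the haversine distance and the Earth radius are assumptions chosen for illustration; the thesis does not state which formulas were used, and the sample numbers are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
KM_PER_DEG_LAT = math.pi * EARTH_RADIUS_KM / 180.0


def forecast_position(lat0, lon0, u_west, v_north, hours):
    """Advance a position using average speeds (km/hr, + west, + north)."""
    lat1 = lat0 + v_north * hours / KM_PER_DEG_LAT
    mean_lat = math.radians(0.5 * (lat0 + lat1))
    lon1 = lon0 - u_west * hours / (KM_PER_DEG_LAT * math.cos(mean_lat))
    return lat1, lon1


def great_circle_km(lat_a, lon_a, lat_b, lon_b):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat_a), math.radians(lat_b)
    dlmb = math.radians(lon_b - lon_a)
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


# Hypothetical 24-hour forecast verified against a hypothetical warning position.
fc_lat, fc_lon = forecast_position(15.0, 135.0, 13.0, 9.0, 24.0)
error_km = great_circle_km(fc_lat, fc_lon, 17.2, 132.5)
print(f"forecast ({fc_lat:.2f}N, {fc_lon:.2f}E), error {error_km:.0f} km")
```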

The  forecast  errors  are  summarized  in  Table  XVII  by 
pressure  level  and  forecast  interval.  It  is  stressed  that 
these  results  were  derived  using  only  the  dependent  cases. 
As  expected,  the  forecast  error  increases  with  increasing 
length  of  forecast  interval.  However,  the  magnitudes  of  the 
increases  are  reasonable.  The  increase  in  the  72-hour  fore¬ 
cast  error  over  the  48-hour  forecast  error  was  much  smaller 
than  that  for  Shaffer's  (1982)  dependent  sample.  The
smallest  change  for  the  current  study  was  about  82  km  less 
than  for  Shaffer's  results.  It  was  previously  noted  that 
there  was  a  rapid  decrease  in  R2  with  increasing  forecast 
interval  for  Shaffer's  equations.  Shaffer's  equations 
predicted  short-term  movement  well,  but  the  errors  grew 
rapidly  with  increasing  time.  The  24-hour  forecast  error 
for  this  study  was  about  25  km  larger  than  for  Shaffer's
dependent  sample.  However,  the  best  48-  and  72-hour  fore¬ 
cast  errors  for  the  current  study  were  28  km  and  90  km  less 
than  those  of  Shaffer.  Stability  of  predictand  variance  for 
the  current  study  resulted  in  models  that  give  promise  of 
improvement  of  long-term  forecasts. 

There  were  no  overwhelming  differences  in  performance  of 
the  equations  derived  for  the  three  levels  at  any  forecast 
interval.  Shaffer's  (1982)  forecast  equations  based  on  an 
EOF  analysis  of  geopotential  height  at  500  mb,  700  mb  and 
850  mb  also  did  not  have  significant  differences  in  errors 
among  the  three  levels.  However,  Shaffer's  500  mb  equations 
outperformed  the  other  two  equation  sets  by  a  wide  margin 
for  a  small  set  of  independent  cases.  Although  the  72-hour 
forecast  errors  in  Table  XVII  are  largest  for  the  250  mb 
equation,  they  still  compare  favorably  with  the  JTWC  mean 


procedure for 15 of the 18 equations, including all nine of
the 24-hour forecast equations. For the zonal 48-, zonal
72- and meridional 72-hour forecast equations, the second or
third variable selected was for past movement. The
predictors UOLD2, UOLD3, VOLD2 and VOLD3 (see Table VIII for
description) were most frequently chosen. These results
were in agreement with Neumann's (1978) observation that
statistical screening techniques invariably select present
and past storm movement over steering predictors derived
from the surrounding flow for short-term tropical cyclone
movement. However, the predictions are not simply persistence
forecasts. Mode variables CU1, CU2, CV1 and CV2 were
often the second, third or fourth predictors selected. This
was not surprising since the first 2 modes account for the
largest percentages of variance in the wind-component
fields. From 2 to 10 zonal EOF coefficient predictors and
from 2 to 7 meridional EOF coefficient predictors were
chosen for the forecast equations, so that wind forcing was
also found to be an important determinant of tropical
cyclone movement.

Several potential predictors were not included in any of
the equations: DATE, CINT, VOLD1 and DISP1. The potential
predictor UOLD1 was retained in only one forecast equation.
These  past  movement  variables  represent  the  interval  from  24 
to  12  hours  prior  to  base  time.  Very  little  information 
would  be  lost  by  exclusion  of  these  potential  predictors. 

The potential performance of this regression forecast
scheme was evaluated by testing on the dependent data cases.
The following procedure was applied for the forecast intervals
24, 48 and 72 hours at each pressure level (700 mb, 400
mb and 250 mb). First, the appropriate equations were used
to predict the average zonal and meridional speeds of the
tropical cyclone. These speeds were converted to zonal and
meridional displacements of the tropical cyclone during the

same magnitude indicates mean movement to the northwest.
The values of R2 for the zonal equations were significantly
greater due to the larger variability in zonal movement.

Values of R2 do not vary greatly with forecast interval
for either the zonal or meridional equations at any of the
pressure levels. The largest deviations are for the 700 mb
zonal equations and the 250 mb meridional equations. In
contrast, Shaffer's (1982) regression equations consistently
displayed a significant decrease in the value of R2 with
increasing forecast interval (about 0.1 per 12 hour
interval). These differences in the variation of R2 with
forecast interval may account for differences in forecast
error characteristics discussed later in this section.

Finally, the accuracy of the zonal or meridional equations
is not a strong function of pressure. For either the
zonal or meridional movement, the equation derived using the
EOF coefficients for a given level does not perform significantly
better (as measured by R2) than the equations for the
other two levels. This was similar to results obtained by
Shaffer (1982) for the dependent sample. Slightly larger
values of R2 are found for the 700 mb zonal equations at all
three forecast intervals.

Tables XI through XVI summarize the regression equations.
The first value in each table is the intercept. The
average speed component (km/hr) is obtained by adding to the
intercept the sum of the products of the non-zero regression
coefficients and the values of the associated variables.
The goal of parsimony in the selection of variables was met;
the main purposes of retaining as few variables as possible
were to obtain simple equations and to avoid overfitting.
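
As an illustration of how one of these equations would be evaluated, the following sketch uses the 24-hour zonal equation for 400 mb with coefficients as read from Table XIII. The predictor values for the sample case are hypothetical, as are their units, since Table VIII is not reproduced here. The function name is also an assumption.

```python
def predict_speed(intercept, coefficients, predictors):
    """Return intercept + sum(coefficient * predictor value), in km/hr."""
    return intercept + sum(coef * predictors[name]
                           for name, coef in coefficients.items())


# 24-hour zonal equation, 400 mb (coefficients as read from Table XIII).
INTERCEPT_24 = 2.0174
COEFFS_24 = {
    "UOLD3": 0.3266, "DISP2": 0.0398, "DISP3": -0.0190,
    "CU1": 0.5621, "CU2": 0.9062, "CU25": -0.9854,
    "CV1": -0.4448, "CV2": 0.2838,
}

# Hypothetical predictor values for one case (not taken from the thesis).
case = {"UOLD3": 12.0, "DISP2": 150.0, "DISP3": 290.0,
        "CU1": 4.0, "CU2": -1.5, "CU25": 0.2, "CV1": 0.8, "CV2": -0.6}

speed = predict_speed(INTERCEPT_24, COEFFS_24, case)
print(f"predicted average zonal speed: {speed:.2f} km/hr")
```

Predictors with a tabulated coefficient of .0 were not retained for that forecast interval and are simply omitted from the dictionary.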

Several observations were made regarding the variables
retained for the regression equations and the order of
selection. A past movement variable (predictors 5-10 in
Table VIII) was the first variable selected in the stepwise

independent variables in the regression model. The R2
statistic is defined:

\[ R^{2} = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \]

where SSTO is the total sum of squares, SSR is the regression
sum of squares and SSE is the residual sum of squares.
The R2 statistic measures the proportion of the total variation
in the predictand associated with the use of the independent
variables. The regression equations retained only
those predictors which resulted in an increase in R2 of at
least 0.01.
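
The two forms of the R2 definition can be checked numerically. The sketch below fits an ordinary least-squares model with an intercept to synthetic data and evaluates both SSR/SSTO and 1 - SSE/SSTO; the data and the function name are illustrative assumptions.

```python
import numpy as np


def r_squared(y, y_hat):
    """Return (SSR/SSTO, 1 - SSE/SSTO) for observations y and fitted y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ssto = np.sum((y - y.mean()) ** 2)      # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
    sse = np.sum((y - y_hat) ** 2)          # residual sum of squares
    return ssr / ssto, 1.0 - sse / ssto


# Synthetic predictand with two predictors plus noise (illustrative only).
rng = np.random.default_rng(2)
x = rng.standard_normal((300, 2))
y = 5.0 + 2.0 * x[:, 0] - 1.0 * x[:, 1] + rng.standard_normal(300)

design = np.column_stack([np.ones(len(y)), x])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
r2_from_ssr, r2_from_sse = r_squared(y, design @ beta)
print(r2_from_ssr, r2_from_sse)
```

The two values agree for a least-squares fit that includes an intercept, because SSTO = SSR + SSE holds in that case.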

The value of R2 for each regression equation is given in
Table IX. Matching forecast times and pressure levels, the
value of R2 for a zonal equation is always at least 0.12
greater than the R2 for the meridional equation for the same
forecast interval and pressure level. Shaffer (1982) found
differences as large as 24 percent. The zonal regression
equations account for a greater portion of the total zonal
movement variation than the meridional equations. This
observation agrees with Shaffer (1982). At least 59 percent
of the total variation in zonal movement was accounted for
by the equations at each of the three pressure levels for
any forecast interval. Values of R2 for the meridional
equations range from 0.325 for the 250 mb 72-hour forecast
to 0.475 for the 700 mb 24-hour forecast.

The greater predictive ability of the zonal equations
was expected. First, it was shown in Chapter IV that fewer
modes were required to describe the zonal wind than the
meridional wind. Second, there is greater variation in the
zonal movement than in the meridional movement. The means
and standard deviations of the average zonal and meridional
speeds of the various forecast intervals are given in Table
X. Positive mean zonal and meridional components with the

cases for regression analysis. Sample sizes were 4C9, 503
and 232 cases for the 24-, 48- and 72-hour equations,
respectively.

C.  THE  EQUATIONS  AND  ERROR  ANALYSIS

A linear stepwise regression analysis was chosen to
derive equations to predict future average zonal and meridional
speeds of the tropical cyclones. Although an a
priori assumption of linearity could not be made, the number
of polynomial predictors generated from a base set of 83
potential predictors would have been intractable. The UCLA
biomedical computer program BMDP2R (Dixon and Brown, 1979)
was used for the regressions. Multicollinearity, which
occurs when some or all of the independent variables are
highly correlated (Neter and Wasserman, 1974), was avoided
by the use of stepwise regression. Multicollinearity
fosters a large potential for overfitting since many
different models would provide the same good fit. As a
result, it becomes impossible to interpret any one set of
regression coefficients as being representative of the
effects of the different independent variables. Also, the
estimated regression coefficients usually have a very large
sampling variability so that they are imprecise and lose
their meaning (or significance). The BMDP routine includes
a preset tolerance to automatically screen the variables at
each step. A potential predictor was not allowed to enter
the model if it was highly correlated with any predictor
chosen in earlier steps. To ensure that a predictor was
significantly (in a statistical sense) correlated with the
predictand, a minimum F-to-enter value of 4.0 was imposed
(Dixon and Brown, 1979).
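
The screening logic described in this paragraph can be sketched as a simplified forward selection. This is not the BMDP2R program: it omits any step that removes variables, and the tolerance definition (the share of a candidate's variance not explained by predictors already entered) and its default value of 0.01 are assumptions made for illustration. Only the F-to-enter value of 4.0 comes from the text. The sketch assumes no predictor column is constant.

```python
import numpy as np


def _sse(design, target):
    """Residual sum of squares of a least-squares fit of target on design."""
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    return float(resid @ resid)


def forward_stepwise(x, y, f_to_enter=4.0, tolerance=0.01):
    """Forward selection with an F-to-enter threshold and a tolerance screen."""
    n, p = x.shape
    selected = []
    sse_current = _sse(np.ones((n, 1)), y)
    while len(selected) < p:
        design = np.column_stack([np.ones(n)] + [x[:, c] for c in selected])
        df_resid = n - design.shape[1] - 1   # residual d.f. after one more term
        if df_resid <= 0:
            break
        best = None
        for j in range(p):
            if j in selected:
                continue
            xj = x[:, j]
            # Tolerance: share of xj's variance not explained by entered terms.
            tol = _sse(design, xj) / np.sum((xj - xj.mean()) ** 2)
            if tol < tolerance:
                continue
            sse_new = _sse(np.column_stack([design, xj]), y)
            f_stat = (sse_current - sse_new) / max(sse_new / df_resid, 1e-300)
            if best is None or f_stat > best[0]:
                best = (f_stat, j, sse_new)
        if best is None or best[0] < f_to_enter:
            break
        selected.append(best[1])
        sse_current = best[2]
    return selected


# Synthetic example: column 5 is a near-duplicate of column 0.
rng = np.random.default_rng(3)
x = rng.standard_normal((200, 6))
x[:, 5] = x[:, 0] + 1e-3 * rng.standard_normal(200)
y = 3.0 * x[:, 0] - 2.0 * x[:, 3] + 0.5 * rng.standard_normal(200)
selected = forward_stepwise(x, y)
print("selected columns, in order of entry:", selected)
```

Once either column 0 or its near-duplicate has entered, the tolerance screen keeps the other out, which is the behavior the text attributes to the preset tolerance.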

The  coefficient  of  multiple  determination  (R2)  is  a
measure  of  the  association  between  the  dependent  and


The recent motion of the storm is an integral part of
the prediction model of nearly all tropical cyclone forecast
methods. Most cyclones move with uniform direction and
speed (Gray, 1978). Satisfactory forecasts of tropical
cyclone movement can be based mainly on extrapolation and
climatology. Because there are relatively few storms with
anomalous tracks, predictors based on present and past movement
tend to dominate a statistical analysis of storm
motion. These "difficult" storms, which are associated with
above-average forecast errors, tend to recurve or to move
erratically with nonclimatological tracks. A persistence-
climatology forecast leads to large errors for the 20-25
percent of the cases of anomalous motion (Gray, 1978). When
there are not many storms during a season, a single anomalous
storm can result in a significant bias of the yearly
mean forecast error (Neumann, 1981).

The  remaining  potential  predictors  are  related  to  obser¬ 
vations  of  the  tropical  cyclone  at  base  time.  Tropical 
cyclone  intensity  (potential  predictor  4,  Table  VIII)  was 
the  JTWC  warning  maximum  sustained  wind  speed  at  base  time. 
The  Julian  date  and  the  JTWC  warning  position  latitude  and 
longitude  (potential  predictors  1,  2  and  3,  Table  VIII) 
completed  the  set  of  potential  independent  variables. 

The  682  cases  from  the  EOF  analysis  were  used  to  select 
the  cases  for  the  regression  analysis.  For  a  case  to  be 
included,  a  complete  set  of  potential  predictors  had  to  be 
available.  In  addition  to  availability  of  the  GBA,  the  JTWC 
reports  had  to  be  available  at  12  and  24  hours  prior  to  base 
time  and  at  least  24  hours  subsequent  to  base  time. 
Similarly,  selection  of  that  case  for  development  of  regres¬ 
sion  equations  for  48-  or  72-hour  forecasts  required  that 
JTWC  warning  positions  be  available  at  48  or  72  hours, 
respectively. These requirements decreased the number of

forecast times. In the subsequent discussion, comparisons
of the results obtained in this study for the dependent data
are made with those obtained for the dependent data in
Shaffer's (1982) study based on an EOF analysis of geopotential
height. Shaffer's predictands were computed using the
JTWC warning and best-track positions at base and forecast
times, respectively.

Predictors  were  sought  to  assess  quantitatively  the 
effect  of  five  factors  on  tropical  cyclone  movement: 
(1)  external  (to  the  cyclone)  physical  forcing;  (2)  previous 
cyclone  movement;  (3)  cyclone  intensity;  (4)  date;  and  (5) 
initial  (warning)  position.  Table  VIII  describes  the  83 
potential  predictors  used  for  the  regression  analysis  and 
identifies these predictors by name and number. The potential
predictors were identical for all 18 regression equations,
except that the regression equation for a specific
level  included  only  the  EOF  coefficients  at  that  level. 

Synoptic external forcing on a tropical cyclone has been
conjectured to be an important determinant of cyclone movement
(Brown, 1981; and others). To incorporate quantitatively
the wind forcing, the EOF coefficients associated
with the first 35 zonal and meridional modes were selected
as potential predictors. These coefficients are potential
predictors 14 through 83 (CU1 through CU35 and CV1 through
CV35) in Table VIII. An important objective of this study
was  to  evaluate  how  well  these  EOF  coefficients  represented 
atmospheric  features  that  affected  cyclone  movement. 

Persistence  has  long  been  known  to  be  a  good  predictor 
of  short-term  tropical  cyclone  motion.  Therefore,  nine
potential  predictors  representing  past  zonal  and  meridional 
motions  were  included.  These  were  variables  5  through  13  in 
Table  VIII.  Each  prior  average  speed  or  vector  displacement 
was  based  on  JTWC  warning  positions  to  simulate  operational 
conditions. 


The  following  discussion  suggests  other  possible 
research  to  improve  the  operational  model: 

1. The forecast scheme could be improved if other variables
representing physical features affecting storm
movement could be identified and included in the
regression equations. Intensity, represented by
maximum sustained wind speed, was found to be an
unimportant predictor in both this study and
Shaffer's (1982). Following Chan and Gray (1982),
variables such as the size of the cyclone should be
tested in the regression analysis. Model verification
of George and Gray's (1976) observation that the
700 mb level best specifies cyclone speed and that
the 500 mb level best specifies cyclone direction
might be attempted.

2. The EOF-based regression forecast scheme is not
limited to input of coefficients derived from analyses.
Coefficients derived from prognostic data
fields, such as a 24-hour forecast from a dynamic
numerical prediction model, might improve the
longer-range forecasts.

3.  Each  EOF  coefficient  represents  the  contribution  of 
the  associated  eigenvector  to  the  total  forcing.  The 
resultant  tropical  cyclone  movement  is  a  summation  of 
the  total  forcing  by  all  modes.  Additional  insight
into  the  more  important  modes  for  tropical  cyclone 
forcing  could  possibly  be  obtained  by  examination  of 
the  correlation  of  the  modes  with  the  tropical 
cyclone  movement. 

4.  Vertical  coupling  might  be  represented  in  the  EOF 
modes  for  the  three  levels.  Testing  would  involve 
the  development  and  analysis  of  regression  models 
using  EOF  coefficients  from  more  than  one  pressure 
level in various combinations.

5. The zonal and meridional components of tropical
cyclone movement are forecast separately by the
current scheme, even though tropical cyclone movement
is a vector quantity. Correlations exist between the
zonal and meridional components of motion (Shapiro
and Neumann, 1984). Improvements might arise from
inclusion of potential predictors which account for
the correlation between the zonal and meridional
components of motion. Also, an operational model
might be improved using a grid rotated along the
direction of cyclone motion (Shapiro and Neumann,
1984).

6. A vector EOF analysis may improve the identification
of forcing modes for tropical cyclone movement.
Rotation of the eigenvectors could also be investigated
for potential improvement of the method. As
previously noted, more eigenvectors may have to be
retained to guard against underfactoring.

The  EOF-regression  model  definitely  shows  promise  for
improvement  of  operational  forecasts  of  tropical  cyclone 
movement.  This  simple  regression  model  performed  very  well 
on  dependent  data.  Additional  reductions  in  forecast  error 
may  be  possible  through  inclusion  of  more  sophisticated 
physical  forcing  parameters  and  prognostic  fields.  Further 
research  appears  warranted. 




TABLE XVI

Intercept and regression coefficients for the meridional
average speed equation using 250 mb EOFs.

                  FORECAST INTERVAL (H)
VARIABLE          24          48          72
INTERCEPT       6.8262      6.6448      6.0899
UOLD2          -0.0776       .0          .0
VOLD2           0.3989      0.2656       .0
VOLD3            .0          .0         0.2255
CU1              .0          .0         0.2161
CU5             0.2181       .0          .0
CU10             .0         0.5887       .0
CU25             .0          .0        -0.8033
CU2?             .0         0.5864       .0
CV1            -0.1544       .0        -0.1733
CV2            -0.2741     -0.2411       .0
CV3             0.2210       .0          .0
CV5             0.2283       .0         0.2425
CV6              .0         0.2313       .0
CV9            -0.2988       .0          .0
CV11             .0          .0         0.3307
CV12             .0        -0.2143       .0
CV16           -0.3347     -0.3239       .0
CV17             .0        -0.3017     -0.3271
CV18             .0         0.2813      0.3346
CV23             .0          .0         0.3263
CV26             .0        -0.4296       .0

TABLE XV

Intercept and regression coefficients for the zonal
average speed equation using 250 mb EOFs.

                  FORECAST INTERVAL (H)
VARIABLE          24          48          72
INTERCEPT       0.0172    -27.3931    -22.8406
CLAT             .0        -0.4879     -0.5434
CLON             .0         0.2827      0.2535
UOLD2           -C          0.3075      0.2259
UOLD3           0.5534       .0          .0
DISP2           0.0449       .0          .0
DISP3          -0.0218       .0          .0
CU1             0.4932      0.3131      0.1727
CU2            -0.3763     -0.3331     -0.2389
CU3             0.4395       .0          .0
CU4              .0         0.343b       .0
CU10             .0          .0        -0.7225
CU12             .0        -0.5748       .0
CU2?             .0          .0        -1.0172
CV1              .0         0.5851      0.7922
CV3            -0.3831     -0.3128       .0
CV6              .0          .0        -0.5246
CV9              .0          .0        -0.3744
CV11             .0        -0.5210       .0
CV1?            -0          0.6694      0.9332
CV27             .0        -0.6646       .0
CV28             .0          .0         0.6376

TABLE XIV

Intercept and regression coefficients for the meridional
average speed equation using 400 mb EOFs.

                  FORECAST INTERVAL (H)
VARIABLE          24          48          72
INTERCEPT       6.3235      7.8935      7.1388
VOLD2           0.3743      0.2005       .0
VOLD3            .0          .0         0.1901
CU2            -0.1766     -0.2360     -0.1997
CU5              .0        -0.1852       .0
CU7              .0        -0.2726       .0
CU10             .0         0.3693      0.2350
CU13           -0.5186     -0.6135     -0.4330
CU22             .0          .0         0.5219
CU25            0.7326      0.7109      1.0265
CU29            -0           .0        -0.7744
CU31            -0         -0.6577     -0.7560
CU35             .0          .0         0.6557
CV2            -0.2634     -0.1735       .0
CV4              .0          .0        -0.2465
CV7              .0         0.2766      0.2093
CV8              .0         0.2584       .0
CV12             .0          .0         0.2765
CV15             .0          .0         0.2765
CV16             .0        -0.3967     -0.4005

TABLE XIII

Intercept and regression coefficients for the zonal
average speed equation using 400 mb EOFs.

                 FORECAST INTERVAL (H)

                    24          48          72

INTERCEPT       2.0174    -15.7644     -4.3064
CLAT                .0          .0     -0.4520
CLON                .0      0.1314      0.0974
UOLD2               .0      0.2379      0.1377
UOLD3           0.3266          .0          .0
DISP2           0.0398          .0          .0
DISP3          -0.0190          .0          .0
CU1             0.5621      0.6868      0.7350
CU2             0.9062      0.8759      0.7496
CU6                 .0          .0      0.4717
CU11                .0          .0     -0.5958
CU25           -0.9854     -0.9190     -1.2793
CU23                .0          .0      0.8767
CV1            -0.4448          .0          .0
CV2             0.2838      0.4178      0.3366
CV16                .0          .0      0.3673
CV18                .0          .0      0.7152
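
Each column of the table above can be read as one linear regression equation: the predicted zonal average speed is the intercept plus the sum of each coefficient times its predictor. The sketch below evaluates the 24-h column of the 400 mb zonal table. It assumes that an entry of .0 means the predictor was not selected for that interval. The function name `predict_speed` and the predictor values passed to it are hypothetical and are not taken from the thesis.

```python
# Coefficients from the 24-h column of the 400 mb zonal table.
# Predictors with a .0 entry are omitted (assumption: not selected).
INTERCEPT_24H = 2.0174
COEFFICIENTS_24H = {
    "UOLD3": 0.3266,
    "DISP2": 0.0398,
    "DISP3": -0.0190,
    "CU1": 0.5621,
    "CU2": 0.9062,
    "CU25": -0.9854,
    "CV1": -0.4448,
    "CV2": 0.2838,
}


def predict_speed(intercept, coefficients, predictors):
    """Return intercept + sum(coefficient * predictor value).

    Predictors missing from `predictors` are treated as zero.
    """
    return intercept + sum(
        coef * predictors.get(name, 0.0) for name, coef in coefficients.items()
    )


# Hypothetical predictor values, for illustration only.
example_predictors = {"UOLD3": 10.0, "CU1": 1.0, "CV1": -1.0}
speed_24h = predict_speed(INTERCEPT_24H, COEFFICIENTS_24H, example_predictors)
print(f"Predicted 24-h zonal average speed: {speed_24h:.4f}")
```

The same function applies to the 48-h and 72-h columns, and to the meridional tables, once the corresponding intercept and non-zero coefficients are substituted.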

TABLE XII

Intercept and regression coefficients for the meridional
average speed equation using 700 mb EOFs.

                 FORECAST INTERVAL (H)

                    24          48          72

INTERCEPT       6.0283      6.0252      5.4471
UOLD1               .0          .0      0.1836
VOLD2           0.3738      0.1458          .0
VOLD3               .0          .0      0.2115
DISP2               .0          .0      0.0133
DISP3               .0      0.0048     -0.0122
CU1                 .0          .0      0.1048
CU2                 .0     -0.1964     -0.1919
CU4            -0.1490     -0.1379          .0
CU5                 .0      0.1661          .0
CU6            -0.2689          .0          .0
CU7                 .0     -0.4349     -0.4106
CU8                 .0      0.4143          .0
CU12                .0     -0.3181          .0
CU22                .0     -0.3317          .0
CU25            0.5253      0.7267      0.6355
CU26                .0          .0      0.5146
CV2            -0.4647     -0.3718     -0.3523
CV3             0.2861          .0      0.1796
CV7                 .0     -0.2651     -0.2476
CV8                 .0     -0.3642     -0.2000
CV9                 .0     -0.2030          .0
CV13                .0      0.2525          .0
CV14                .0          .0     -0.2473
CV16            0.3149          .0          .0



TABLE XI

Intercept and regression coefficients for the zonal
average speed equation using 700 mb EOFs.

                 FORECAST INTERVAL (H)

                    24          48          72

INTERCEPT       2.7082      4.8142      5.9913
UOLD2               .0      0.3399      0.2099
UOLD3           0.2632          .0          .0
VOLD3          -0.4945     -0.3005     -0.2596
DISP2           0.0256          .0          .0
CU1             0.2820      0.2456      0.2643
CU2             0.7176      0.6637      0.6653
CU5                 .0          .0     -0.5090
CU7                 .0      0.3342      0.5877
CU8            -0.3730          .0     -0.3206
CU14                .0     -0.6074          .0
CU16                .0      0.5874          .0
CU18                .0          .0     -0.6714
CU23                .0          .0      0.6666
CU24                .0          .0     -0.7699
CU26                .0     -0.7092     -1.1047
CU27                .0          .0      1.0278
CV1            -0.3872          .0          .0
CV2                 .0      0.3031          .0
CV4            -0.3112     -0.4360          .0
CV7                 .0      0.3337      0.3179
CV23                .0          .0      0.6105
CV23            0.7880          .0          .0



TABLE IX

Sample size and R² by forecast time and level
for the zonal and meridional equations.

                            FORECAST INTERVAL (H)

                                24        48        72

NUMBER OF DEPENDENT CASES      409       308       232

ZONAL EQUATIONS
  700 mb                     0.647     0.708     0.637
  400 mb                     0.612     0.637     0.613
  250 mb                     0.623     0.607     0.589

MERIDIONAL EQUATIONS
  700 mb                     0.475     0.439     0.447
  400 mb                     0.492     0.408     0.336
  250 mb                     0.481     0.440     0.325

TABLE X

Means and standard deviations of the predictands (km/h)
for the dependent sample.

                            FORECAST INTERVAL (H)

                                24        48        72

ZONAL AVERAGE SPEED
  MEAN                         8.2       8.9       9.?
  STANDARD DEVIATION          14.1      11.8      10.3

MERIDIONAL AVERAGE SPEED
  MEAN                         8.6       8.3       7.8
  STANDARD DEVIATION           8.0       6.7       5.7

TABLE VIII

Potential predictors for regression analysis

VARIABLE NUMBER   NAME           DESCRIPTION

1                 DATE           Julian date.
2                 CLAT           Warning position latitude.
3                 CLON           Warning position longitude.
4                 CINT           Maximum sustained wind speed (kts).
5                 UOLD1          Average zonal cyclone movement from
                                 24 to 12 h before base time (m/s).
6                 UOLD2          Average zonal cyclone movement for
                                 12 h before base time (m/s).
7                 UOLD3          Average zonal cyclone movement for
                                 24 h before base time (m/s).
8                 VOLD1          Average meridional cyclone movement
                                 from 24 to 12 h before base time (m/s).
9                 VOLD2          Average meridional cyclone movement
                                 for 12 h before base time (m/s).
10                VOLD3          Average meridional cyclone movement
                                 for 24 h before base time (m/s).
11                DISP1          Vector displacement for 24 to 12 h
                                 before base time (m).
12                DISP2          Vector displacement for 12 h before
                                 base time (m).
13                DISP3          Vector displacement for 24 h before
                                 base time (m).
14 to 48          CU1 to CU35    EOF coefficients derived for zonal
                                 modes 1 to 35.
49 to 83          CV1 to CV35    EOF coefficients derived for
                                 meridional modes 1 to 35.

TABLE VI

Summary of the number of modes retained and
percentage of variance described (in parentheses).

              ZONAL        MERIDIONAL

700 mb      24 (84.7)      35 (81.7)
400 mb      21 (86.0)      33 (82.4)
250 mb      19 (87.0)      29 (82.0)

TABLE VII

Percentages of variance accounted for
when 35 modes are retained for all wind component fields.

              ZONAL        MERIDIONAL

700 mb         90.0           81.7
400 mb         92.1           83.6
250 mb         93.2           85.4

TABLE V

Mean eigenvalues and 95 percent confidence levels
as computed by the Monte Carlo technique.

MODE        MEAN          MEAN EIGENVALUE
         EIGENVALUE       PLUS TWICE THE
                          STANDARD DEVIATION

   1        3.424              4.116
   2        3.318              3.989
   3        3.299              3.966
   4        3.261              3.919
   5        3.228              3.880
   6        3.211              3.859
   7        3.192              3.837
   8        3.139              3.773
   9        3.110              3.738
  10        3.101              3.728
  11        3.086              3.710
  12        3.048              3.664
  13        3.031              3.643
  14        2.986              3.589
  15        2.970              3.570
  16        2.963              3.561
  17        2.930              3.522
  18        2.917              3.507
  19        2.900              3.486
  20        2.872              3.452
  21        2.850              3.425
  22        2.816              3.385
  23        2.808              3.375
  24        2.797              3.362
  25        2.774              3.334
  26        2.766              3.325
  27        2.743              3.297
  28        2.723              3.273
  29        2.697              3.242
  30        2.681              3.222
  31        2.663              3.200
  32        2.643              3.177
  33        2.636              3.168
  34        2.627              3.158
  35        2.619              3.149
  36        2.591              3.114
 100        1.817              2.184
 300        0.569              0.684
 400        0.238              0.286
 527        0.016              0.019
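
The thresholds in the table above can be used to decide how many modes to retain. The following is a minimal sketch in Python, assuming the rule is to keep the leading modes whose observed eigenvalue exceeds the Monte Carlo mean eigenvalue plus twice the standard deviation. The function name `count_retained_modes` is hypothetical. The observed eigenvalues are those listed for the 250 mb zonal field (modes 1 to 36), and the thresholds for modes 5 and 12 are read here as 3.880 and 3.664.

```python
# Observed eigenvalues, 250 mb zonal field, modes 1 to 36.
OBSERVED_250_ZONAL = [
    160.599, 84.874, 48.568, 31.797, 21.493, 16.544, 13.239, 11.062, 9.858,
    9.373, 8.456, 7.832, 6.523, 5.982, 5.147, 4.687, 4.287, 4.004,
    3.625, 3.299, 2.974, 2.812, 2.661, 2.369, 2.234, 2.070, 2.061,
    1.897, 1.703, 1.610, 1.542, 1.420, 1.354, 1.339, 1.266, 1.212,
]

# Monte Carlo mean eigenvalue plus twice the standard deviation, modes 1 to 36.
THRESHOLDS = [
    4.116, 3.989, 3.966, 3.919, 3.880, 3.859, 3.837, 3.773, 3.738,
    3.728, 3.710, 3.664, 3.643, 3.589, 3.570, 3.561, 3.522, 3.507,
    3.486, 3.452, 3.425, 3.385, 3.375, 3.362, 3.334, 3.325, 3.297,
    3.273, 3.242, 3.222, 3.200, 3.177, 3.168, 3.158, 3.149, 3.114,
]


def count_retained_modes(observed, thresholds):
    """Count leading modes whose eigenvalue exceeds the threshold.

    Counting stops at the first mode that fails the test (assumed rule).
    """
    retained = 0
    for eigenvalue, threshold in zip(observed, thresholds):
        if eigenvalue <= threshold:
            break
        retained += 1
    return retained


n_retained = count_retained_modes(OBSERVED_250_ZONAL, THRESHOLDS)
print(f"Modes retained for the 250 mb zonal field: {n_retained}")
```

Under these assumptions the count is 19, which agrees with the number of modes listed as retained for the 250 mb zonal field.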

TABLE IV

250 mb component wind fields: eigenvalues and
cumulative percentage of variance (in parentheses).

MODE      250 mb ZONAL

   1      160.599  (30.5)
   2       84.874  (46.?)
   3       48.568  (55.9)
   4       31.797  (61.9)
   5       21.493  (66.0)
   6       16.544  (69.2)
   7       13.239  (71.7)
   8       11.062  (73.8)
   9        9.858  (75.6)
  10        9.373  (77.4)
  11        8.456  (79.0)
  12        7.832  (80.5)
  13        6.523  (81.8)
  14        5.982  (82.9)
  15        5.147  (83.9)
  16        4.687  (84.8)
  17        4.287  (85.6)
  18        4.004  (86.3)
  19        3.625  (87.0)
  20        3.299  (87.7)
  21        2.974  (88.2)
  22        2.812  (88.8)
  23        2.661  (89.3)
  24        2.369  (89.7)
  25        2.234  (90.1)
  26        2.070  (90.5)
  27        2.061  (90.9)
  28        1.897  (91.3)
  29        1.703  (91.6)
  30        1.610  (91.9)
  31        1.542  (92.2)
  32        1.420  (92.5)
  33        1.354  (92.7)
  34        1.339  (93.0)
  35        1.266  (93.2)
  36        1.212  (93.5)
   *
 100        0.146  (98.7)
   *
 300        0.004  (99.9)
   *
 527        0.000  (100.)

MODE      250 mb MERIDIONAL

   1       49.187
   2       41.620
   3       28.325
   4       26.942
   5       24.028
   6       20.635
   7       18.627
   8       16.555
   9       15.219
  10       12.888
  11       12.162
  12       11.283
  13       10.369
  14       10.050
  15        8.838
  16        8.253
  17        7.214
  18        6.815
  19        6.808
  20        6.203
  21        5.883
  22        5.232
  23        4.881
  24        4.861
  25        4.416
  26        4.074
  27        3.870
  28        3.533
  29        3.410
  30        3.212
  31        3.121
  32        2.950
  33        2.858

            2.566

            3.121

 100        0.314
 300        0.009
 527        0.000

TABLE III

400 mb component wind fields: eigenvalues and
cumulative percentage of variance (in parentheses).

MODE      400 mb ZONAL          400 mb MERIDIONAL

   1                             47.827   (9.1)
   2       86.314  (41.7)        44.391  (17.5)
   3       45.448  (50.3)        28.707  (23.0)
   4       30.134  (56.0)        26.999  (28.1)
   5       24.910  (60.8)        24.486  (32.8)
   6       17.362  (64.1)        22.029  (37.0)
   7       14.977  (66.9)        20.417  (40.8)
   8       13.151  (69.4)        19.241  (44.5)
   9       10.865  (71.5)        16.344  (47.6)
  10        9.788  (73.3)        15.236  (50.5)
  11        8.807  (75.0)        14.514  (53.3)
  12        8.504  (76.6)        13.844  (55.9)
  13        7.924  (78.1)        12.340  (58.2)
  14        6.862  (79.4)        11.115  (60.3)
  15        6.406  (80.6)        10.677  (62.4)
  16        6.091  (81.8)         9.888  (64.2)
  17        5.738  (82.9)         8.965  (66.0)
  18        4.839  (83.8)         7.820  (67.4)
  19        4.181  (84.6)         7.768  (68.9)
  20        3.727  (85.3)         6.984  (70.2)
  21        3.630  (86.0)         6.562  (71.5)
  22        3.276  (86.6)         6.366  (72.7)
  23        3.101  (87.2)         6.065  (73.9)
  24        2.938  (87.8)         5.548  (74.9)
  25        2.670  (88.3)         5.456  (75.9)
  26        2.653  (88.8)         5.351  (77.0)
  27        2.522  (89.3)         4.783  (77.9)
  28        2.385  (89.7)         4.452  (78.7)
  29        2.155  (90.1)         4.201  (79.5)
  30        1.984  (90.5)         4.071  (80.3)
  31        1.816  (90.8)         3.885  (81.0)
  32        1.772  (91.2)         3.746  (81.7)
  33        1.703  (91.5)         3.478  (82.4)
  34        1.534  (91.8)         3.278  (83.0)
  35        1.509  (92.1)         3.063  (83.6)
  36        1.437  (92.4)         2.924  (84.2)
   *
 100        0.152  (98.6)         0.340  (97.0)
   *
 300        0.004  (99.9)         0.009  (99.9)
 527        0.000  (100.)         0.000  (100.)

TABLE II

700 mb component wind fields: eigenvalues and
cumulative percentage of variance (in parentheses).

MODE      700 mb ZONAL          700 mb MERIDIONAL

   1       95.157  (18.1)        41.438
   2       72.772  (31.9)        38.417
   3       44.161  (40.3)        29.189
   4       34.540  (46.9)        25.566
   5       31.632  (52.9)        24.171
   6       20.146  (56.7)        20.959
   7       16.750  (59.9)        19.123
   8       15.459  (62.8)        18.485
   9       13.343  (65.4)        16.477
  10       12.325  (67.7)        15.751
  11       10.354  (69.7)        14.613
  12        9.312  (71.5)        12.695
  13        8.989  (73.2)        11.757
  14        8.417  (74.8)        11.478
  15        7.488  (76.2)        10.429
  16        6.715  (77.5)        10.171
  17        6.390  (78.7)         9.090
  18        5.701  (79.8)         8.595
  19        5.173  (80.7)         8.424
  20        4.931  (81.7)         7.539
  21        4.373  (82.5)         7.136
  22        4.244  (83.3)         7.011
  23        3.801  (84.0)         6.492
  24        3.445  (84.7)         5.816
  25        3.244  (85.3)         5.688
  26        3.174  (85.9)         5.518
  27        2.839  (86.5)         5.103
  28        2.790  (87.0)         4.8?5
  29        2.663  (87.5)         4.611
  30        2.482  (88.0)         4.405
  31        2.403  (88.4)         4.254
  32        2.237  (88.8)         4.070
  33        2.205  (89.3)         3.751
  34        1.990  (89.6)         3.445
  35        1.913  (90.0)         3.207
  36        1.809  (90.4)         3.039
   *
 100               (98.2)         0.397
   *
 300        0.005  (99.9)         0.012
   *
 527        0.000  (100.)         0.000



APPENDIX  B 
TABLES 


TABLE I

Operational models for the prediction
of tropical cyclone motion over the North Atlantic.*

MODEL     TYPE MODEL              DESCRIPTION

HURRAN    STATISTICAL             Analog model based on tracks of all
                                  Atlantic tropical cyclones since
                                  1886. (Operational 1968)

CLIPER    STATISTICAL             Regression equation model utilizing
                                  predictors derived from climatology
                                  and persistence. (Operational 1971)

NHC67     STATISTICAL-            Regression equation model utilizing
          SYNOPTIC                predictors derived from climatology,
                                  persistence and observed geopotential
                                  height data. (Operational 1967)

NHC72     STATISTICAL-            Regression equation model utilizing
          SYNOPTIC                predictors derived from output of
                                  CLIPER model and observed geopotential
                                  height data. (Operational 1972)

NHC73     STATISTICAL-            Regression equation model utilizing
          DYNAMICAL               predictors derived from output of
                                  CLIPER model, observed and numerically
                                  forecast geopotential height data.
                                  (Operational 1973)

SANBAR    DYNAMICAL               Barotropic model based on pressure-
                                  weighted wind field averaged through
                                  troposphere and represented on a
                                  154 km (at 22.5 N) spaced grid.
                                  (Operational 1970)

MFM       DYNAMICAL               Movable Fine Mesh (MFM) baroclinic
                                  model having 10 levels in the vertical
                                  and 60 km grid spacing in the
                                  horizontal. (Operational 1976)

* (from Neumann and Pelissier, 1981)


TABLE XVII

Mean and standard deviation (km) forecast vector errors
for the dependent sample.

                                  FORECAST INTERVAL (H)

                                      24        48        72

NUMBER OF DEPENDENT DATA CASES       409       308       232

MEAN VECTOR ERROR
  700 mb                           200.7     351.7      465.
  400 mb                           189.3     349.0      453.
  250 mb                           204.0     365.4      491.

STANDARD DEVIATION
  700 mb                           134.9     217.2      256.
  400 mb                           131.5     225.2      297.
  250 mb                           138.0     231.7      158.